Data Mining, Text Mining, and Applications

What is Data Mining?

 

Data mining is the process of extracting useful information from a huge amount of data. The word mining normally means digging into the soil to extract something valuable, and data mining works the same way. We start with a bulk of data, often in raw form, analyze it, and extract the useful information we want. Analysis here means examining the data to pick out the part that matters: for example, if only 20% of the data is relevant to our question, we select and study that part to extract useful information.

Data mining is also called knowledge mining, because the goal is to extract knowledge from a huge amount of data. Today, data mining is used in almost every field that has a large amount of data. We can apply data mining to different kinds of data sources, which include:

    1. Transactional Databases
    2. Relational Databases
    3. World Wide Web
    4. Spatial Databases
    5. Data Warehouse

Data mining consists of three steps:

    1. Data Pre-Processing
    2. Data Extraction
    3. Data Evaluation

1 Data Pre-Processing

In the data pre-processing step, we collect data from different sources and then clean it. If the data contains duplicate values, we remove them. For example, if the data is stored in a table, we check the table for duplicate rows and remove any we find. Each row must be unique, and every cell should hold a single value.
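The duplicate-removal step can be sketched in a few lines of Python. This is only a minimal illustration: the sample rows and the helper name `remove_duplicates` are made up for this example, and real cleaning usually involves more than exact-match duplicates.

```python
# Hypothetical raw records collected from several sources: (name, section, marks).
raw_rows = [
    ("Ali", "10-A", 91),
    ("Sara", "10-B", 95),
    ("Ali", "10-A", 91),   # exact duplicate of the first row
    ("Omar", "10-A", 78),
    ("Sara", "10-B", 95),  # exact duplicate of the second row
]


def remove_duplicates(rows):
    """Return the rows with exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique_rows = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique_rows.append(row)
    return unique_rows


clean_rows = remove_duplicates(raw_rows)
print(clean_rows)
```

After this step every row in `clean_rows` appears exactly once, which is the "each row must be unique" condition described above.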

2 Data Extraction

In data extraction, we take the pre-processed data, select the data we need, and exclude the data we don't need. For example, if we have school data and want only the top student, we apply data extraction to select the top student and exclude the others.
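As a rough sketch of the school example, the snippet below keeps only the student (or students, if there is a tie) with the highest marks. The records and the function name `select_toppers` are assumptions made for illustration.

```python
# Hypothetical school data, already pre-processed.
students = [
    {"name": "Ali", "marks": 91},
    {"name": "Sara", "marks": 95},
    {"name": "Omar", "marks": 78},
]


def select_toppers(records):
    """Return every record that has the highest marks (all ties are kept)."""
    if not records:
        return []
    best = max(record["marks"] for record in records)
    return [record for record in records if record["marks"] == best]


toppers = select_toppers(students)
print(toppers)
```

Keeping ties is a design choice here; if exactly one student is required, an extra tie-breaking rule would be needed.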

3 Data Evaluation

In data evaluation, we report on the extracted data. We can present it in different formats such as graphs, charts, and document reports.
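One simple way to present extracted data is a plain-text bar chart. The sketch below is only an illustration of the reporting idea; the function name `text_bar_chart`, the sample marks, and the bar width are all assumptions, and it expects non-negative values.

```python
def text_bar_chart(values, width=20):
    """Return one line per item, with a bar scaled to the largest value."""
    if not values:
        return []
    largest = max(values.values()) or 1  # avoid dividing by zero when all values are 0
    lines = []
    for label, value in values.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<6}{bar} {value}")
    return lines


# Hypothetical extracted data: marks per student.
marks = {"Ali": 91, "Sara": 95, "Omar": 78}
report = text_bar_chart(marks)
print("\n".join(report))
```

In practice a plotting library or a reporting tool would usually replace this, but the idea is the same: turn the extracted numbers into something a reader can compare at a glance.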


Applications

    1. Fraud Detection
    2. Research Analysis
    3. Biological Analysis
    4. Healthcare
    5. CRM
    6. Manufacturing Engineering
    7. Market Basket Analysis
    8. Lie detection
    9. Education
    10. Financial Analysis


Data mining is also used to solve business problems by converting raw data into useful information for further processing. In data mining, we identify the data, prepare it, evaluate it, and present it at the end.


Data Mining Types

    1. Relational Database
    2. Data Repositories
    3. Data Warehouse
    4. Object-Relational Database
    5. Transactional Database

1 Relational Database

A relational database stores data in the form of tables, where each table consists of rows (tuples) and columns. The rows and columns store the data, and whenever we need that data we access it and apply data mining to it.
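To make the table idea concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table name, column names, and rows are invented for this example; the query computes the average marks per section as a very simple form of analysis.

```python
import sqlite3

# In-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, section TEXT, marks INTEGER)")

# Each tuple below becomes one row of the table.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ali", "10-A", 91), ("Sara", "10-B", 95), ("Omar", "10-A", 78)],
)

# Access the stored data and summarize it: average marks per section.
avg_by_section = conn.execute(
    "SELECT section, AVG(marks) FROM students GROUP BY section ORDER BY section"
).fetchall()
conn.close()

print(avg_by_section)
```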

2 Data Warehouse

In simple words, a data warehouse is like a storeroom where a shopkeeper keeps products and uses them when they are required. A data warehouse works the same way: it stores data collected from different sources until it is needed. Data warehouses are used by banks and brands, for storing CCTV recordings, and much more.

3 Data Repository

A data repository is a place where we store data. A database works as this kind of storage: many applications and organizations use databases for storing their data.

Benefits of Data Mining

  1. Data mining can save a lot of cost, so it is cost-effective.
  2. Data mining makes decisions easier, because we take action based on analyzed data.
  3. Data mining can be added to an existing system or included in a new one.
  4. Data mining can process data quickly and provide fast results.
  5. It provides useful, knowledge-based information that is very important for any organization.
  6. Data mining plays an important role in finding hidden patterns, meaning relationships in the data that nobody had noticed before.
  7. Data mining can also be used to analyze behavior.

Drawbacks of Data Mining

The main drawback of data mining is the risk to privacy. An organization that holds a huge amount of data may sell that data to other organizations for the sake of money, without caring about the privacy of the people the data belongs to.

Let's take an example to understand this point. Suppose we watch a video on YouTube, Facebook, Instagram, or some other social media app and then close it. When we later pick up the phone and open another app, we may see ads or suggestions related to what we were watching before.

The second main drawback is that data mining relies on tools, and many of these tools are difficult to use. To overcome this problem, we need special training on those tools.

Text Mining

Text mining is the process of extracting useful information from text such as articles and documents. We can also call it information mining. In text mining, we start with unstructured data, apply mining to it, extract the useful information, and put that information into a structured form. We can use text mining to extract the relevant text or words from the data.
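A very small sketch of turning unstructured text into a structured form is to count how often each meaningful word appears. The stop-word list, the sample sentence, and the function name `term_frequencies` below are assumptions chosen for illustration; real text mining pipelines use much richer processing.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption for this example).
STOP_WORDS = {"the", "is", "a", "of", "and", "to", "in", "from"}


def term_frequencies(text):
    """Lowercase the text, split it into words, and count the non-stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in STOP_WORDS)


document = (
    "Data mining extracts useful information from data. "
    "Text mining extracts useful text."
)
freqs = term_frequencies(document)
print(freqs.most_common(3))
```

The resulting `Counter` is structured data: each word is a key and its count is a value, which can be stored in a table or analyzed further.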

When we text mine the data, we often apply clustering to it. In clustering, we group similar words together so that each group forms a cluster.
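To give a feel for grouping similar words, the sketch below uses a simple greedy, single-pass grouping based on string similarity from Python's standard `difflib` module. This is a stand-in chosen for brevity, not a standard clustering algorithm such as k-means; the word list, the function name `cluster_words`, and the 0.5 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def cluster_words(words, threshold=0.5):
    """Greedily put each word into the first cluster whose first word is similar enough."""
    clusters = []
    for word in words:
        for cluster in clusters:
            representative = cluster[0]
            similarity = SequenceMatcher(None, representative, word).ratio()
            if similarity >= threshold:
                cluster.append(word)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([word])
    return clusters


words = ["cluster", "clusters", "clustering", "mining", "miner", "mined", "data"]
clusters = cluster_words(words)
for group in clusters:
    print(group)
```

Because the grouping is greedy, the result depends on the order of the words and on the threshold. Similarity here is based on spelling only, so words with similar meaning but different spelling will not end up together.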

A lot of unstructured data is available on Google and many other platforms. We can use that data for text mining and extract useful information from it.

In text mining, the data we work with comes in three formats:

  1. Structured data
  2. Semi-Structured data
  3. Unstructured data

What is Structured Data?

Structured data is data in an organized form that we can easily understand, access, and use. Data stored in table form is known as structured data.

What is Semi-Structured Data?

Semi-structured data has some organization but no fixed table layout. It is typically found in formats such as XML and JSON files.
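As an illustration, the sketch below reads a small XML snippet with Python's standard `xml.etree.ElementTree` module and turns it into rows. The tag names, attribute names, and sample values are invented for this example. Note that the second student has an `email` tag and the first does not, which is the kind of irregularity that makes the data semi-structured rather than structured.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured input: not every student has the same fields.
xml_text = """
<students>
  <student name="Ali"><marks>91</marks></student>
  <student name="Sara"><marks>95</marks><email>sara@example.com</email></student>
</students>
"""


def xml_to_rows(text):
    """Convert each <student> element into a dictionary with fixed keys."""
    root = ET.fromstring(text.strip())
    rows = []
    for node in root.findall("student"):
        rows.append({
            "name": node.get("name"),
            "marks": int(node.findtext("marks")),
            "email": node.findtext("email"),  # None when the tag is absent
        })
    return rows


rows = xml_to_rows(xml_text)
print(rows)
```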

What is Unstructured Data?

Unstructured data is data in raw format. Much of the content we see on websites is unstructured. Free-text files, images, and videos are examples of unstructured data.

Why do we use Data Mining?

Data mining is very important for extracting useful information. If we have a large amount of data, we can apply data mining to it to get useful information, and then use that information for making decisions.
