A study needs to collect data from different sources and test those data with
suitable techniques to support prediction or decision making. Data mining uses
machine learning, statistical, and visualization techniques to discover and
present knowledge in a form that is easily comprehensible to humans. Data
mining tools scour databases for hidden patterns, finding predictive
information that experts may miss because it lies outside their expectations.
Researchers need this kind of tool to analyze their data. In this paper, we
discuss various available data mining tools. The paper presents an overview of
data mining tools such as WEKA, Tanagra, KNIME, RapidMiner, and Orange.
Keywords: data mining tools, WEKA, Tanagra, KNIME, RapidMiner, Orange, data
preprocessing, data analysis.
Data mining means extracting knowledge from various sources. Data mining
relies on many tools for prediction and classification, such as WEKA, Orange,
and KNIME. Data mining tools are available that help improve the quality of
both past and present data. They are used to analyze data sets and report the
results. This paper explains data mining and data mining tools, and studies
WEKA, Orange, and three other data mining tools in a clear and effective
manner.
Data mining is about going from data to information, information that can
give useful predictions. Data mining automates the discovery of relevant
patterns in a database, using defined approaches and algorithms to examine
present and past data, which can then be analyzed to predict future trends.
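As a small illustration of using past data to anticipate a future trend, the following sketch fits a straight line to a series of past observations by ordinary least squares and extrapolates one step ahead. It is only one of many possible techniques and is not taken from any of the tools surveyed here; the function names and the sales figures are hypothetical.

```python
def fit_trend(values):
    """Fit y = slope * t + intercept by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    covariance = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    intercept = mean_y - slope * mean_t
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Extrapolate the fitted line beyond the last observation."""
    slope, intercept = fit_trend(values)
    next_t = len(values) - 1 + steps_ahead
    return slope * next_t + intercept


# Hypothetical past observations (for example, sales per period).
past_sales = [10.0, 12.0, 14.0, 16.0]
print(forecast(past_sales))
```

A linear trend is a deliberately simple choice; real data mining tools offer many richer models for the same purpose.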
CATEGORIES OF DATA MINING TOOLS
There are three main categories of data mining tools: traditional data mining
tools, dashboards, and text-mining tools.
Traditional Data Mining Tools: Traditional data mining programs help
companies establish data patterns and trends by using a number of complex
algorithms and techniques. Some of these tools are installed on the desktop to
monitor the data and highlight trends, while others capture information
residing outside a database.
Dashboards: Installed on computers to monitor information in a database,
dashboards reflect data changes and updates on screen, often in the form of a
chart or table, enabling the user to see how the business is performing.
Text-Mining Tools: The third type of data mining tool is sometimes called a
text-mining tool because of its ability to mine data from different kinds of
text, from Microsoft Word and Acrobat PDF documents to simple text files, for
example.
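To make the idea of mining text concrete, the sketch below counts the most frequent terms across a few plain-text documents. It is a minimal illustration rather than the method of any particular text-mining tool: the stop-word list and sample sentences are hypothetical, and extracting text from Word or PDF files would need additional libraries that are not shown.

```python
import re
from collections import Counter

# Hypothetical, very small stop-word list; real tools use much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, limit=3):
    """Return the most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for text in documents:
        counts.update(word for word in tokenize(text) if word not in STOP_WORDS)
    return counts.most_common(limit)


# Hypothetical sample documents, already available as plain text.
documents = [
    "Data mining finds patterns in data.",
    "Text mining applies mining to documents.",
]
print(top_terms(documents, limit=2))
```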
DATA MINING TOOLS
Some of the tools available on the market are described below.
WEKA, officially called the Waikato Environment for Knowledge Analysis, is a
computer program that was developed at the University of Waikato in New
Zealand for the purpose of identifying information from raw data collected
from agricultural domains.
WEKA supports many standard data mining tasks, such as data preprocessing,
regression, classification, clustering, visualization, and feature selection.
The basic premise of the application is to use a computer program that
performs machine learning and derives useful information in the form of
trends and patterns.
WEKA is also an open source application. It is user friendly, with a
graphical interface that allows for quick setup and operation. WEKA operates
on the premise that the user's data is available as a flat file, which means
that each data object is described by a fixed number of attributes, each of a
specific type, normally alphanumeric or numeric. The WEKA application is
mainly used to uncover hidden information in databases or files through
simple options and visual interfaces.
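The flat-file premise can be illustrated with a short sketch that reads a small comma-separated table, checks that every record has the same number of attributes, and labels each attribute as numeric or alphanumeric. This is not WEKA's own loader or file format; the function names and the sample records are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Return 'numeric' if every value parses as a number, else 'alphanumeric'."""
    try:
        for value in values:
            float(value)
    except ValueError:
        return "alphanumeric"
    return "numeric"


def describe_flat_file(text):
    """Map each attribute name in a header-first flat file to its inferred type."""
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]
    # A flat file requires the same fixed number of attributes in every record.
    for line_number, record in enumerate(records, start=2):
        if len(record) != len(header):
            raise ValueError(
                f"line {line_number}: expected {len(header)} attributes, "
                f"got {len(record)}"
            )
    columns = zip(*records)
    return {name: infer_type(column) for name, column in zip(header, columns)}


# Hypothetical sample data with two alphanumeric attributes and one numeric.
SAMPLE = """outlook,temperature,play
sunny,85,no
rainy,65,yes
"""
print(describe_flat_file(SAMPLE))
```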
Orange is an open source data visualization, data mining, and analysis tool.
In Orange, data mining is done through visual programming and Python
scripting. The tool has components for machine learning and also contains
add-ons for bioinformatics. It offers different visualizations such as bar
charts, trees, networks, and heat maps. The data analysis framework is defined
by combining various widgets. Orange specializes in data mining. Data mining
is a core step of knowledge discovery in data sets; it includes data
preparation and modelling. Orange is an analytical tool for examining data
from different perspectives and summarizing it into useful information.
Open and free software: Orange is an open source and free data mining
software tool.
Orange is supported on various versions of Linux, Microsoft Windows, and
Apple Mac.
Orange supports visual programming tools for data mining.
Orange provides Python scripting for predictive testing and modelling in data
mining. Orange itself is not a scripting language; it also offers a graphical
interface.
KNIME, the Konstanz Information Miner, is an open source data analytics,
reporting, and integration platform. KNIME integrates various components for
machine learning and data mining through its modular data pipelining concept.
A graphical user interface allows assembly of nodes for data preprocessing
(ETL: Extraction, Transformation, Loading), for modelling, and for data
analysis and visualization without, or with only minimal, programming. To some
extent KNIME can be considered an alternative to SAS.
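The modular pipelining idea can be sketched in a few lines of code: each node is a small function that takes a data set and returns a transformed one, and a pipeline is an ordered list of such nodes. This is only a conceptual illustration and does not use KNIME's actual node API; the node names and temperature readings are hypothetical.

```python
from functools import reduce


def drop_missing(rows):
    """Preprocessing node: discard records that contain a missing value."""
    return [row for row in rows if all(v is not None for v in row.values())]


def to_celsius(rows):
    """Transformation node: derive a 'temp_c' attribute from 'temp_f'."""
    return [{**row, "temp_c": (row["temp_f"] - 32) * 5 / 9} for row in rows]


def mean_temp(rows):
    """Analysis node: compute the mean of the 'temp_c' attribute."""
    return sum(row["temp_c"] for row in rows) / len(rows)


def run_pipeline(data, nodes):
    """Feed the output of each node into the next one, in order."""
    return reduce(lambda current, node: node(current), nodes, data)


# Hypothetical readings in degrees Fahrenheit, one of them missing.
readings = [{"temp_f": 32.0}, {"temp_f": None}, {"temp_f": 212.0}]
result = run_pipeline(readings, [drop_missing, to_celsius, mean_temp])
print(result)
```

Because every node has the same shape, nodes can be reordered, removed, or replaced without touching the rest of the pipeline, which is the property a graphical workflow editor relies on.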
RapidMiner Studio combines technology and applicability to offer a
user-friendly integration of the newest as well as established data mining
techniques. RapidMiner is a software platform developed by the company of the
same name that provides an integrated environment for machine learning, data
mining, text mining, predictive analytics, and business analytics. Analysis
processes are defined in RapidMiner Studio by dragging and dropping operators,
setting parameters, and combining operators. It is used for business and
industrial applications as well as for research, education, training, rapid
prototyping, and application development, and it supports all steps of the
data mining process. Methods for text mining, web mining, automatic sentiment
analysis of Internet discussion forums (sentiment analysis, opinion mining),
and time series analysis and prediction are also available. RapidMiner uses a
client/server model with the server offered as Software as a Service or on
cloud infrastructure.
Tanagra is open source data mining software for academic and research
purposes. It offers several data mining methods from exploratory data
analysis, statistical learning, machine learning, and the database area. The
main purpose of Tanagra is to give researchers and students data mining
software that is easy to use, conforms to the present norms of software
development, and allows them to analyze either real or synthetic data. This
successor to SIPINA provides users with an easy-to-use interface for the
analysis of any real or synthetic data. It allows researchers to easily add
their own data mining methods or any newly identified data mining technique,
and it also supports them by providing an architecture and a means to compare
the performance of their methods. It provides beginners and novices with a
platform on which they can carry out their experimental procedures.
This paper discussed the definition of data mining, the use of data mining
tools in data preprocessing and classification, and the three categories of
data mining tools. It explained five data mining tools, including WEKA,
Orange, and KNIME, and how they work in data mining processes such as
prediction and classification. In future work we will apply these tools to
specific techniques and examine the efficiency and accuracy of the results
obtained with the various tools.