ABSTRACT
A study needs to collect data from different sources and test those data with suitable techniques for prediction or decision making. Data mining uses machine learning, statistical and visualization techniques to discover and present knowledge in a form which is easily comprehensible to humans. Data mining tools scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.
Researchers need this kind of tool to analyse their data. In this paper, we discuss various available data mining tools and present an overview of WEKA, ORANGE, RAPID MINER, TANAGRA and KNIME.
KEY WORDS: Data mining tools, WEKA, TANAGRA, KNIME, RAPID MINER, ORANGE, data preprocessing, data analysis.
INTRODUCTION: Data mining means extracting knowledge from various sources. Many tools, such as WEKA, Orange and KNIME, are used in data mining for prediction and classification. Data mining tools, which help improve data quality, have long been available. They are used to analyse a set of data and report the results. This paper explains data mining and data mining tools, and studies WEKA, Orange and three other data mining tools in a clear and effective manner.
DATA MINING: Data mining is about going from data to information, information that can give you useful predictions. Data mining automates the discovery of relevant patterns in a database, using well-defined approaches and algorithms to examine present and past data, which can then be analysed to anticipate future trends.
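The idea of automatically discovering patterns in past data can be illustrated with a minimal sketch. The example below is not taken from any of the tools discussed in this paper; the transactions, the function name frequent_pairs and the min_support threshold are all assumptions chosen for illustration. It counts which pairs of items occur together often enough to be treated as a pattern.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is one past transaction (not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(transactions, min_support=3)
print(patterns)
```

A real data mining tool applies far more efficient algorithms to much larger databases, but the principle is the same: the pattern is found by counting, not by an expert guessing in advance where to look.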
CATEGORIES OF DATA MINING TOOLS
There are three main categories of data mining tools: traditional data mining tools, dashboards and text-mining tools.
§ Traditional Data Mining Tools: Traditional data mining programs help companies establish data patterns and trends by using a number of complex algorithms and techniques. Some of these tools are installed on the desktop to monitor the data and highlight trends, and others capture information residing outside a database.
§ Dashboards: Installed on computers to monitor information in a database, dashboards reflect data changes and updates on screen, often in the form of a chart or table, enabling the user to see how the business is performing.
§ Text-mining Tools: The third type of data mining tool is sometimes called a text-mining tool because of its ability to mine data from different kinds of text, from Microsoft Word and Acrobat PDF documents to simple text files, for example.
DATA MINING TOOLS
Some of the tools available in the market are described as follows: WEKA, ORANGE, KNIME, RAPID MINER and TANAGRA.
WEKA TOOL
WEKA, officially called the Waikato Environment for Knowledge Analysis, is a computer program that was developed at the University of Waikato in New Zealand to identify information from raw data.
The data was originally collected from agricultural domains. WEKA supports many standard data mining tasks such as data preprocessing, regression, classification, clustering, visualization and feature selection. The basic premise of the application is to use a computer application that performs machine learning and derives useful information in the form of trends and patterns. WEKA is also an open source application. It is user friendly, with a graphical interface that allows quick setup and operation. WEKA operates on the assumption that the user's data is available as a flat file, in which each data object is described by a fixed number of attributes and each attribute is of a specific type, normally alphanumeric or numeric. The WEKA application is mainly used to extract hidden information from databases or files through simple options and a visual interface.
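The flat-file assumption described above can be made concrete with a small sketch. WEKA's native flat-file format is ARFF, which this paper does not describe; the sample data, the function name read_flat_file and the simplified parsing rules below are assumptions for illustration only, and real ARFF supports further attribute types and quoting that are ignored here. The sketch checks that every data object has the same fixed number of attributes and converts numeric attributes to numbers.

```python
import io

# Hypothetical sample in the style of WEKA's ARFF flat-file format.
SAMPLE = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def read_flat_file(handle):
    """Parse a simplified ARFF-like file into (attributes, rows).

    attributes is a list of (name, type) pairs, where type is "numeric"
    or "nominal". Every row must have exactly one value per attribute.
    """
    attributes, rows, in_data = [], [], False
    for raw in handle:
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            kind = "numeric" if spec.lower() == "numeric" else "nominal"
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
        elif in_data:
            values = [v.strip() for v in line.split(",")]
            if len(values) != len(attributes):
                raise ValueError(f"expected {len(attributes)} values: {line!r}")
            # Convert numeric columns; keep nominal columns as strings.
            rows.append([
                float(v) if kind == "numeric" else v
                for v, (_, kind) in zip(values, attributes)
            ])
    return attributes, rows


attributes, rows = read_flat_file(io.StringIO(SAMPLE))
print(attributes)
print(rows[0])
```

A row with a missing or extra value raises an error, which reflects the requirement that each data object carries a fixed number of typed attributes.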
ORANGE: Orange is an open source data visualization, analysis and data mining tool. In Orange, data mining is done through visual programming and Python scripting. The tool has components for machine learning, and it also offers add-ons for bioinformatics. It provides different visualizations such as bar charts, trees, networks and heat maps. An analytical workflow is defined by combining various widgets. Orange specializes in data mining, which is a core step of knowledge discovery in data sets and includes data preparation and modelling. Data mining software is an analytical tool for analysing data: it supports the process of analysing data from different perspectives and summarising it into useful information.
FEATURES: Open and free software, platform independent software, programming support, scripting interface.
Open and free software: Orange is an open source and free data mining software tool.
Platform independent software: Orange is supported on various versions of Linux, Microsoft Windows and Apple Mac.
Programming support: Orange supports visual programming for data mining.
Scripting interface: Orange provides a Python scripting interface.
KNIME: KNIME is used for predictive analysis and modelling in data mining. It is not a scripting language; it offers a graphical interface. KNIME, the Konstanz Information Miner, is an open source data analytics, reporting and integration platform.
KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface allows assembly of nodes for data preprocessing (ETL: Extraction, Transformation, Loading), for modelling, and for data analysis and visualization without, or with only minimal, programming. To some extent KNIME can be considered an alternative to SAS.
RAPID MINER: Rapid Miner Studio combines technology and applicability to offer a user-friendly integration of the newest as well as established data mining techniques. Rapid Miner is a software platform, developed by the company of the same name, that provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics. Analysis processes are defined in Rapid Miner Studio by dragging and dropping operators, setting parameters and combining operators. It is used for business and industrial applications as well as for research, education, training, rapid prototyping and application development, and it supports all steps of the data mining process. Methods for text mining, web mining, automatic sentiment analysis of Internet discussion forums (opinion mining), and time series analysis and prediction are also available.
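The way a process is built by setting parameters on operators and combining them, much like the node pipelines of KNIME, can be imitated in a few lines of plain Python. This is only a sketch of the concept and not the API of Rapid Miner or KNIME; the Operator class, the functions filter_rows and scale_column, and the sample data are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Operator:
    """One processing step: a name, a function and its parameters."""
    name: str
    func: Callable[..., List[Row]]
    params: dict

    def run(self, rows: List[Row]) -> List[Row]:
        return self.func(rows, **self.params)


def filter_rows(rows, column, minimum):
    """Keep rows whose value in `column` is at least `minimum`."""
    return [r for r in rows if r[column] >= minimum]


def scale_column(rows, column, factor):
    """Return copies of the rows with `column` multiplied by `factor`."""
    return [{**r, column: r[column] * factor} for r in rows]


def run_process(operators, rows):
    """Feed the output of each operator into the next one."""
    for op in operators:
        rows = op.run(rows)
    return rows


# Hypothetical process: two operators combined in sequence.
process = [
    Operator("filter", filter_rows, {"column": "age", "minimum": 18}),
    Operator("scale", scale_column, {"column": "income", "factor": 0.5}),
]
data = [{"age": 15, "income": 0.0}, {"age": 30, "income": 52000.0}]
result = run_process(process, data)
print(result)
```

Because each operator only receives rows and returns rows, operators can be reordered or replaced without changing the others, which is what makes a drag-and-drop interface over such a chain practical.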
Rapid Miner uses a client/server model, with the server offered as Software as a Service or on cloud infrastructures.
TANAGRA: Tanagra is open source data mining software for academic and research purposes. It offers several data mining methods from exploratory data analysis, statistical learning, machine learning and the database area. The main purpose of Tanagra is to give researchers and students a platform for using data mining software easily, conforming to the present norms of software development and allowing them to analyse either real or synthetic data. As the successor of SIPINA, it provides users with an easy-to-use interface for the analysis of real or synthetic data. It allows researchers to easily add their own data mining methods or any newly identified data processing technique, and supports them by providing an architecture and a means to compare the performance of their methods. It also gives novices a platform where they can carry out their experimental procedures.
CONCLUSION: This paper discussed the definition of data mining, the use of data mining tools for data preprocessing and classification, and the three categories of data mining tools. It described five data mining tools, namely WEKA, Orange, KNIME, Rapid Miner and Tanagra, and how they work in data mining processes such as prediction and classification. In future, we will apply these tools to specific techniques and examine the efficiency and accuracy of the results across the various tools.