Data preparation and cleansing is a tiresome topic!
It takes up a lot of time, but it is essential for conducting a meaningful analysis and delivering correct analysis reports. To automate this data preparation process for your own raw data, InfoZoom can be executed via the command line.
InfoZoom Tips & Tricks – Automated data preparation via command line
1. InfoZoom Tips & Tricks – Part 3
Automated data preparation via command line
2. About corma GmbH
Stops suspects by: analytical investigations, operative investigations
Saves time by: online research, online monitoring
Increases efficiency and saves money by: data analytics, global intelligence solutions
3. Sample scenario
Automation of the following steps
1. Import raw data into a predefined InfoZoom template
2. Cleanse raw data via predefined queries
3. Create and save several data extracts as CSV files from the cleansed file for import into a database
Note:
All steps need to be carried out manually once before the command line is created!
• Create an InfoZoom template for the raw data, including:
o Attribute groups, formulas, analysis cubes, queries
• Report output
o Can be built into the queries (optional)
o Possible formats: Excel table, CSV file, TXT file, InfoZoom file
5. Step 2
Create queries to cleanse the raw data
• e.g. delete all characters except digits from phone numbers
• Or delete all spaces at the beginning and end and convert to upper case
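The two cleansing rules named above are defined as queries inside InfoZoom itself. Purely as an illustration of what such queries do to a value, here is a minimal Python sketch; the function names and the sample values are hypothetical and not part of InfoZoom.

```python
import re


def digits_only(phone: str) -> str:
    """Delete every character except the digits 0-9."""
    return re.sub(r"[^0-9]", "", phone)


def trim_and_upper(value: str) -> str:
    """Remove leading and trailing spaces, then convert to upper case."""
    return value.strip(" ").upper()


# Hypothetical sample values, for illustration only.
print(digits_only("+49 (0) 123 / 45-67"))   # 4901234567
print(trim_and_upper("  germany "))         # GERMANY
```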
6. Step 3
Save data extracts as CSV files via queries
• Perform selection:
o Exclude blank data entries for CSV data extracts
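To make the selection step concrete, the following Python sketch shows the same idea outside InfoZoom: only records whose chosen column is non-blank end up in the CSV extract. The column names ORGA and URL are assumptions derived from the extract file names used later in the deck, and the sample rows are invented.

```python
import csv
import io


def extract_non_blank(rows, required, fields, delimiter=","):
    """Return CSV text with only the rows whose `required` column is non-blank."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fields,
        delimiter=delimiter,
        lineterminator="\n",
        extrasaction="ignore",  # silently drop columns not listed in `fields`
    )
    writer.writeheader()
    for row in rows:
        # Treat missing values and whitespace-only values as blank.
        if (row.get(required) or "").strip():
            writer.writerow(row)
    return out.getvalue()


# Hypothetical sample records.
rows = [
    {"ORGA": "Example Ltd", "URL": "example.com"},
    {"ORGA": "No Website Inc", "URL": ""},
    {"ORGA": "Blank Space Co", "URL": "   "},
]
text = extract_non_blank(rows, required="URL", fields=["ORGA", "URL"])
print(text)
```

In this sketch only the first record survives, because the other two have an empty or whitespace-only URL.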
7. Step 4
Command line parameters
• Open a text editor and save the file as *.cmd
• In the first line, enter the path of InfoZoom.exe on the "C" drive
• Run InfoZoom in the background
o Command: -invisible
• Open the predefined template
o Template name (if the name contains spaces, it must be enclosed in quotation marks)
• Import the raw data into the previously opened template
o Command: -insert -d ";" (-d = delimiter, e.g. semicolon or circumflex; it must be enclosed in quotation marks)
• Execute the predefined queries
o Command: -query "query name" (if the name contains spaces, it must be enclosed in quotation marks)
• Save the selection as a CSV file
o Command: -saveObjectsAscsv , "ORGA URL.csv" (delimiter and name of the CSV file)
• Close InfoZoom in the background
o Command: -exit
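The quoting rule above (names containing spaces must be enclosed in quotation marks) is easy to get wrong when several extracts are written by hand. A small Python sketch can generate the fragments instead; it only uses the -query and -saveObjectsAscsv parameters shown above, while the helper names quote_if_needed and save_step are hypothetical.

```python
def quote_if_needed(name: str) -> str:
    """Wrap a name in straight double quotes when it contains a space."""
    return f'"{name}"' if " " in name else name


def save_step(query: str, csv_name: str, delimiter: str = ",") -> str:
    """Build one 'run query, then save selection as CSV' fragment."""
    return (
        f"-query {quote_if_needed(query)} "
        f"-saveObjectsAscsv {delimiter} {quote_if_needed(csv_name)}"
    )


print(save_step("Exclude_blank_Orga", "ORGA.csv"))
print(save_step("Exclude_Blank_URL", "ORGA URL.csv"))
```

Note that the quotation marks must be straight double quotes ("), not typographic ones, for the Windows command interpreter to treat them as quoting.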
8. Step 5
Assemble command line
InfoZoom.exe -invisible Sample_Data.fot
-insert -d "^" Sample_Data.csv
-query Country_cleansing
-query Exclude_blank_Orga -saveObjectsAscsv , ORGA.csv
-query Exclude_Blank_URL -saveObjectsAscsv , "ORGA URL.csv"
-query Exclude_blank_Address -saveObjectsAscsv , "ORGA ADDRESS.csv"
-query Exclude_blank_Contact -saveObjectsAscsv , "ORGA CONTACT.csv"
-exit
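Legend
Commands: -invisible, -insert, -d, -query, -saveObjectsAscsv, -exit
Template and query names: Sample_Data.fot, Country_cleansing, Exclude_blank_Orga, Exclude_Blank_URL, Exclude_blank_Address, Exclude_blank_Contact
Raw data: Sample_Data.csv
Delimiter for newly created CSV files: , (comma)
Names of newly created CSV files: ORGA.csv, "ORGA URL.csv", "ORGA ADDRESS.csv", "ORGA CONTACT.csv"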
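The command is laid out over several lines on the slide for readability, but the *.cmd file must contain it on a single line. One way to avoid accidental line breaks is to join the fragments programmatically. The Python sketch below does this with the sample command from this step; the helper names and the output file name are hypothetical, and InfoZoom.exe may need to be replaced by its full path on your system.

```python
from pathlib import Path

# Fragments of the sample command, one per line of the slide.
PARTS = [
    'InfoZoom.exe -invisible Sample_Data.fot',
    '-insert -d "^" Sample_Data.csv',
    '-query Country_cleansing',
    '-query Exclude_blank_Orga -saveObjectsAscsv , ORGA.csv',
    '-query Exclude_Blank_URL -saveObjectsAscsv , "ORGA URL.csv"',
    '-query Exclude_blank_Address -saveObjectsAscsv , "ORGA ADDRESS.csv"',
    '-query Exclude_blank_Contact -saveObjectsAscsv , "ORGA CONTACT.csv"',
    '-exit',
]


def build_command_line(parts):
    """Join the fragments with single spaces so the result has no line breaks."""
    return " ".join(part.strip() for part in parts)


def write_cmd_file(path, parts):
    """Write the one-line command to a *.cmd file and return the line."""
    line = build_command_line(parts)
    # Write bytes so the Windows line ending is not translated a second time.
    Path(path).write_bytes((line + "\r\n").encode("ascii"))
    return line


line = build_command_line(PARTS)
print(line)
```

Calling write_cmd_file("prepare.cmd", PARTS) would then produce a file that can be started by double-clicking it.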
9. Result
Command line in text editor
• All commands need to be on one line, without any line breaks!
• The command line is executed by double-clicking the *.cmd file
DOS window shows executed commands
Result: Created CSV files
• Preparation time without command line: approx. 4 hours
• Preparation time with command line: approx. 30 minutes
10. InfoZoom Trainings
InfoZoom online trainings
• IZ50 InfoZoom Web-Starter-Seminar
• IZ51 InfoZoom Web-Expert-Seminar
o An overview of all training dates can be found here: http://infozoom-online-training.de/content/infozoomonline-training-trainings.html