- Know basic data processing
- Some familiarity with Excel
- General knowledge about data manipulation
nitroproc is a free data science and machine learning app for iOS, Android and Mac that lets you work with massive datasets using a disk-based approach (similar in spirit to what Hadoop does). It handles heavy data processing as well as advanced machine learning techniques such as random forests and gradient boosting. You can deploy the same scripts and files on any device and run them on its local processor, whether that is a phone, a tablet or a PC, so no server is needed.
It excels on phones and tablets, where you design your scripts by tapping on the screen. The scripts are generated automatically from the instructions you enter. On a PC or Mac, you write the code manually.
It is designed to work with comma-delimited (CSV) files, and its programming syntax is very simple. In this course we will explore how to work with nitroproc, how to write different scripts, and how to use the included examples.
Because nitroproc uses disk-based data processing algorithms, it is especially suited to handling massive files (several GB). This is a valuable feature: typical R or Python workflows load the whole dataset into memory, so they can fail on very large files unless you use specific techniques such as reading the data in chunks.
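To make the disk-based idea concrete, here is a minimal Python sketch of streaming a CSV file row by row so that memory use stays small regardless of file size. It illustrates the general technique only and is not nitroproc's implementation or syntax; the function name, the column names and the tiny demo file are all hypothetical.

```python
import csv
import os
import tempfile
from collections import defaultdict


def streaming_group_mean(path, key_column, value_column):
    """Compute the mean of value_column per key_column, one row at a time."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    with open(path, newline="") as handle:
        # DictReader yields rows lazily, so the file is never fully in memory.
        for row in csv.DictReader(handle):
            sums[row[key_column]] += float(row[value_column])
            counts[row[key_column]] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Tiny hypothetical file standing in for a multi-GB CSV.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("region,sales\nnorth,10\nsouth,4\nnorth,20\nsouth,6\n")
    demo_path = tmp.name

means = streaming_group_mean(demo_path, "region", "sales")
os.remove(demo_path)
print(means)
```

Only the running sums and counts are kept in memory, which is why this style of processing scales to files far larger than the available RAM.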
When you run a script, nitroproc produces a detailed log describing what was done at each step. It is extremely useful for finding and fixing bugs, and for getting a quick look at what each step generates. In this course we will see how you can benefit from these log files and use them to improve your scripts.
nitroproc can be connected to Python or R (on a PC or Mac) using a very straightforward mechanism. We will see how you can do your data processing in nitroproc and then run some statistical analysis in R or Python.
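As a rough illustration of the second half of that workflow, the Python sketch below summarizes a numeric column from an already processed CSV. It assumes the simplest possible handoff, a CSV file written by the processing step; the actual connection mechanism is not described here, and the function name, column names and sample data are hypothetical.

```python
import csv
import io
import statistics


def summarize_column(csv_file, column):
    """Return count, mean, and sample standard deviation of a numeric column."""
    values = [float(row[column]) for row in csv.DictReader(csv_file)]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical processed output; in practice this would be a file opened
# with open(<path to the processed CSV>, newline="").
processed = io.StringIO("id,score\n1,2.0\n2,4.0\n3,6.0\n")
summary = summarize_column(processed, "score")
print(summary)
```

Keeping the heavy processing and the statistical analysis in separate steps means the analysis side only ever sees the reduced, cleaned output.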
Students will learn how to integrate this software easily into their own workflow. No programming experience is necessary, as the syntax is quite simple, and it is even easier with the Android or iOS versions.
The Android and iOS versions include many useful examples showing how to do practically everything that nitroproc supports. For machine learning (ML) in particular, they even include some real datasets such as the iris dataset, which we will use to explain how to build random forests.
Who this course is for:
- People interested in data science, big data and analytics
- People who routinely need to work with large CSV files
- Aspiring or professional data scientists