Welcome to the Big Data QuickStart!
Working effectively with massive amounts of data is more important than ever. To benefit from big data analytics, you need to be able to find useful correlations and insights in your data and apply them to your business.
With InterSystems IRIS™, you can:
- Use the InterSystems IRIS Spark Connector — a fast, efficient data processing engine that relies on the unique features of both Apache Spark and InterSystems IRIS.
- Apply sharding — a tool to divide very large databases into smaller, faster, more easily managed parts.
- Run machine learning algorithms on big data with high accuracy.
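To make the sharding idea above more concrete, here is a minimal conceptual sketch in Python. It is not how InterSystems IRIS implements sharding. It only illustrates the general technique of assigning rows to smaller parts by hashing a key, then combining per-part results. The names `shard_for` and `distribute`, the `trip_id` and `fare` fields, and the sample rows are all hypothetical.

```python
import zlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash (illustrative only)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_shards


def distribute(rows, key_field, num_shards):
    """Group rows into shards according to the hash of their key field."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[key_field], num_shards)].append(row)
    return shards


# Hypothetical sample data standing in for a large table of taxi trips.
trips = [{"trip_id": i, "fare": 5.0 + i} for i in range(10)]
shards = distribute(trips, "trip_id", 3)

# Each shard computes a partial result; the partial results are then combined.
partial_sums = [sum(row["fare"] for row in rows) for rows in shards.values()]
total = sum(partial_sums)
```

In this sketch each part holds only a subset of the rows, so a query such as a sum can be computed per part and then merged, which is the general reason smaller parts can be faster to work with.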
For this exercise, you will use an Apache Zeppelin lab preloaded with data on taxi trips. With the lab, you can load, save, and view the taxi data in the database, compare Apache Spark to SQL for handling and querying data, and apply machine learning algorithms to the data.
Launch Apache Zeppelin lab
Click the button below to launch your Apache Zeppelin lab. Note the login details for the Management Portal, as you may need them later, then follow the Zeppelin link to your exercise. The code in the exercise may take several minutes to run. To start over with a fresh copy of the lab, click DELETE AND RESET YOUR LAB.
Complete the exercise
Each exercise step in the lab includes sample code and an explanation of that code. Read the explanation, then click the Play button (▷) at the top right of the module to execute the code. Feel free to modify the code and the explanations.