This is a getting-started example for Spark SQL with mySQL. To build and deploy a Spark application with the mySQL JDBC driver, you may wish to check out the Spark cluster deploy with extra jars tutorial.

The SQL to create the baby_names table begins:

DROP TABLE IF EXISTS `baby_names`;

CREATE TABLE `baby_names` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,

If you have any questions about the environment setup, leave comments on this post.

From the Spark shell we're going to establish a connection to the mySQL db and then run some queries via Spark SQL. We need to pass in the mySQL JDBC driver jar when we start up the Spark shell. (In a Spark application, any third-party libs such as a JDBC driver would be included in the package.)

Spark SQL with MySQL (JDBC) Example Tutorial

1. Start the spark shell with the --jars argument:

$SPARK_HOME/bin/spark-shell --jars mysql-connector-java-5.1.26.jar

This example assumes the mySQL connector JDBC jar file is located in the same directory where you are calling spark-shell. If it is not, you can specify the path location, such as:

$SPARK_HOME/bin/spark-shell --jars /home/example/jars/mysql-connector-java-5.1.26.jar

Once the Spark shell is running (the banner reports "Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_79)" followed by "Type in expressions to have them evaluated."), let's establish a connection to the mySQL db and read the baby_names table.

2. Connect to the mySQL database and load the table into a DataFrame:

scala> val dataframe_mysql = sqlContext.read.format("jdbc").option("url", "jdbc:mysql://localhost/sparksql").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "baby_names").option("user", "root").option("password", "root").load()

Change the mySQL url and user/password values in the above code as appropriate for your environment.

3. Let's confirm the dataframe by showing the contents of the table:

scala> dataframe_mysql.show

4. Register the data as a temp table for future SQL queries:

scala> dataframe_mysql.registerTempTable("names")

5. We are now in a position to run some SQL, such as:

scala> sqlContext.sql("select * from names").collect.foreach(println)

Conclusion

This example was designed to get you up and running with Spark SQL and mySQL, or any JDBC-compliant database, quickly. Almost all relational databases provide a JDBC driver, including Oracle, Microsoft SQL Server, DB2, MySQL and Postgres.

What other examples would you like to see with Spark SQL and JDBC? Please leave ideas or questions in the comments below. For more Spark tutorials, check out Spark SQL with Scala, and to keep up with changes in Spark SQL, especially DataFrame vs. Dataset, check the Apache Spark SQL documentation from time to time.
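The chained JDBC reader options shown above can also be collected into a plain Scala map before being handed to the reader, since Spark's DataFrameReader accepts a map via .options(...). The sketch below is illustrative only: JdbcOptionsSketch and buildJdbcOptions are names of my own, not part of Spark or Connector/J, and the host, database, and credential values are the placeholder ones from this tutorial.

```scala
// Sketch: assemble the option map that the JDBC reader consumes.
// JdbcOptionsSketch and buildJdbcOptions are hypothetical names for illustration.
object JdbcOptionsSketch {
  def buildJdbcOptions(host: String,
                       database: String,
                       table: String,
                       user: String,
                       password: String): Map[String, String] =
    Map(
      "url"      -> s"jdbc:mysql://$host/$database", // same URL form as used above
      "driver"   -> "com.mysql.jdbc.Driver",         // Connector/J 5.x driver class
      "dbtable"  -> table,
      "user"     -> user,
      "password" -> password
    )

  def main(args: Array[String]): Unit = {
    val opts = buildJdbcOptions("localhost", "sparksql", "baby_names", "root", "root")
    println(opts("url")) // prints: jdbc:mysql://localhost/sparksql
  }
}
```

In the shell, such a map could then be applied in one call, e.g. sqlContext.read.format("jdbc").options(opts).load(), which is equivalent to chaining the individual .option(...) calls shown above.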