In the relational database management system, the data is stored in a normalized format. Therefore, to complete statistical computing, we need very advanced and complex SQL queries. The large and huge data which is present in the form of tables require SQL queries to extract the data from it.
R can easily connect with many of the relational databases like MySql, SQL Server, Oracle, etc. When we extract the information from these databases, by default, the information is extracted in the form of data frame. Once, the data comes from the database to the R environment; it will become a normal R dataset. The data analyst can easily analyze or manipulate the data with the help of all the powerful packages and functions.
RMySQL Package
RMySQL package is one of the most important built-in package of R. This package provides native connectivity between the R and MySql database. In R, to work with MySql database, we first have to install the RMySQL package with the help of the familiar command, which is as follows:
- install.packages("RMySQL")
When we run the above command in the R environment, it will start downloading the package RMySQL.
Output
We have created a database employee in which there is a table employee_info, which has the following record.
We will use the data which we have mentioned above in our upcoming topics.
Create a connection between R and MySql
To work with MySql database, it is required to create a connection object between R and the database. For creating a connection, R provides dbConnect() function. This function takes the username, password, database name, and host name as input parameters. Let's see an example to understand how the dbConnect() function is used to connect with the database.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # Listing the tables available in this database.
- dbListTables(mysql_connect)
Output
R MySQL Commands
In R, we can perform all the SQL commands like insert, delete, update, etc. For performing the query on the database, R provides the dbSendQuery() function. The query is executed in MySQL, and the result set is returned using the R fetch () function. Finally, it is stored in R as a data frame. Let's see the example of each and every SQL command to understand how dbSendQuery() and fetch() functions are used.
Create Table
R provides an additional function to create a table into the database i.e., dbWriteTable(). This function creates a table in the database; if it does not exist else, it will overwrite the table. This function takes the data frame as an input.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- #Creating data frame to create a table
- emp.data<- data.frame(
- name = c("Raman","Rafia","Himanshu","jasmine","Yash"),
- salary = c(623.3,915.2,611.0,729.0,843.25),
- start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11","2015-03-27")),
- dept = c("Operations","IT","HR","IT","Finance"),
- stringsAsFactors = FALSE
- )
-
- # All the rows of emp.data are taken inot MySql.
- dbWriteTable(mysql_connect, "emp", emp.data[, ], overwrite = TRUE)
Output
Select
We can simply select the record from the table with the help of the fetch() and dbSendQuery() function. Let's see an example to understand how to select query works with these two functions.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # selecting the record from employee_info table.
- record = dbSendQuery(mysql_connect, "select * from employee_info")
-
- # Storing the result in a R data frame object. n = 6 is used to fetch first 6 rows.
- data_frame = fetch(record, n = 6)
- print(data_frame)
Output
Select with where clause
We can select the specific record from the table with the help of the fetch() and dbSendQuery() function. Let's see an example to understand how to select query works with where clause and these two functions.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # selecting the specific record from employee_info table.
- record = dbSendQuery(mysql_connect, "select * from employee_info where dept='IT'")
-
- # Fetching all the records(with n = -1) and storing it as a data frame.
- data_frame = fetch(record, n = -1)
- print(data_frame)
Output
Insert command
We can insert the data into tables with the help of the familiar method dbSendQuery() function.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # Inserting record into employee_info table.
- dbSendQuery(mysql_connect, "insert into employee_info values(9,'Preeti',1025,'8/25/2013','Operations')")
Output
Update command
Updating a record in the table is much easier. For this purpose, we have to pass the update query to the dbSendQuery() function.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # Updating the record in employee_info table.
- dbSendQuery(mysql_connect, "update employee_info set dept='IT' where id=9")
Output
Delete command
Below is an example in which we delete a specific row from the table by passing the delete query in the dbSendQuery() function.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # Deleting the specific record from employee_info table.
- dbSendQuery(mysql_connect, "delete from employee_info where id=8")
Output
Drop command
Below is an example in which we drop a table from the database by passing the appropriate drop query in the dbSendQuery() function.
Example
- #Loading RMySQL package into R
- library("RMySQL")
-
- # Creating a connection Object to MySQL database.
- # Conneting with database named "employee" which we have created befoe with the helpof XAMPP server.
- mysql_connect = dbConnect(MySQL(), user = 'root', password = '', dbname = 'employee',
- host = 'localhost')
-
- # Dropping the specific table from the employee database.
- dbSendQuery(mysql_connect, "drop table if exists emp")
Output