Saturday, November 28, 2020

What are CSV files in R, and how do we use them?

A Comma-Separated Values (CSV) file is a plain text file that contains a list of data. These files are often used to exchange data between different applications. For example, most databases and contact managers support CSV files.

These files are sometimes called character-separated values or comma-delimited files. They usually use the comma character to separate values, but sometimes use other characters such as semicolons. The idea is that we can export complex data from one application to a CSV file, and then import the data in that CSV file into another application.
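
Since the delimiter is not always a comma, here is a minimal sketch of reading semicolon-separated and tab-separated text in R. The sample values and the variable names (semi_text, semi_data, tab_data) are made up for illustration, and the data is passed inline through the text argument so that no file is needed.

```r
# Hypothetical semicolon-delimited data, supplied inline via `text =`
# so that the example runs without any file on disk.
semi_text <- "id;name;salary
1;Shubham;613.3
2;Arpita;525.2"

# read.csv() assumes commas by default; `sep` overrides the delimiter.
semi_data <- read.csv(text = semi_text, sep = ";")
print(semi_data)

# read.delim() reads tab-separated text in the same way.
tab_data <- read.delim(text = "id\tname\n1\tShubham\n2\tArpita")
print(tab_data)
```

The same sep argument works when reading from a file name instead of inline text.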

Excel spreadsheets are one of the most common ways in which data scientists store data. R has many packages designed for reading data from Excel spreadsheets, but users often find it easier to save their spreadsheets as comma-separated values files and then use R's built-in functionality to read and manipulate the data.

R allows us to read data from files that are stored outside the R environment. Let's look at how we can read data from and write data to CSV files. When we refer to a file by name only, R looks for it in the current working directory, so the file should be present there. We can also set a different working directory and read the file from there.

R CSV Files

Getting and setting the working directory

In R, getwd() and setwd() are two useful functions. The getwd() function returns the directory that the R session is currently pointing to, and the setwd() function sets a new working directory so that files are read from and written to that directory.

Let's see an example of how the getwd() and setwd() functions are used.

Example

    # Get and print the current working directory.
    print(getwd())
    # Set a new working directory.
    setwd("C:/Users/ajeet")
    # Get and print the working directory again.
    print(getwd())

Output

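The first call prints the directory in which the R session started. After setwd(), the second call prints C:/Users/ajeet.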
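
Changing the working directory affects everything that runs afterwards, so it can help to remember the old directory and restore it when we are done. The following is a small sketch of that pattern. It uses tempdir() only as a stand-in for a folder that is guaranteed to exist, and old_dir is a name chosen for this example.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# tempdir() is used here only because it always exists;
# substitute the folder that holds your own files.
setwd(tempdir())
print(getwd())

# file.exists() tells us whether R can see a file from this directory.
print(file.exists("record.csv"))

# Restore the original working directory.
setwd(old_dir)
```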

Creating a CSV File

A CSV file is a text file in which commas separate the values in each row. Let's create one from the data below: type it into Notepad, choose Save As, set the file type to All Files (*.*), and save it with the .csv extension.

Example: record.csv

    id,name,salary,start_date,dept
    1,Shubham,613.3,2012-01-01,IT
    2,Arpita,525.2,2013-09-23,Operations
    3,Vaishali,63,2014-11-15,IT
    4,Nishka,749,2014-05-11,HR
    5,Gunjan,863.25,2015-03-27,Finance
    6,Sumit,588,2013-05-21,IT
    7,Anisha,932.8,2013-07-30,Operations
    8,Akash,712.5,2014-06-17,Finance

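
If we would rather not use a text editor, the same file can be created from R itself. This is a sketch only: it writes to the session's temporary directory (an assumption made so that the example does not touch your own folders), record_lines and csv_path are names chosen for this example, and the last row uses Finance on the assumption that it is the intended department name.

```r
# The sample records, one string per line of the file.
# The last row uses "Finance", assumed to be the intended department name.
record_lines <- c(
  "id,name,salary,start_date,dept",
  "1,Shubham,613.3,2012-01-01,IT",
  "2,Arpita,525.2,2013-09-23,Operations",
  "3,Vaishali,63,2014-11-15,IT",
  "4,Nishka,749,2014-05-11,HR",
  "5,Gunjan,863.25,2015-03-27,Finance",
  "6,Sumit,588,2013-05-21,IT",
  "7,Anisha,932.8,2013-07-30,Operations",
  "8,Akash,712.5,2014-06-17,Finance"
)

# Write the file into the temporary directory (an assumption for this
# sketch); use a path of your choice instead.
csv_path <- file.path(tempdir(), "record.csv")
writeLines(record_lines, csv_path)

# Confirm that the file exists and show its raw contents.
print(file.exists(csv_path))
print(readLines(csv_path))
```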

Reading a CSV file

R provides the read.csv() function, which reads a CSV file from the current working directory when we give it just a file name. The function takes the file name as input and returns all the records in the file.

Let's read the records from our record.csv file using the read.csv() function.

Example

    data <- read.csv("record.csv")
    print(data)

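Output

The code prints the contents of record.csv as a table: the five column names (id, name, salary, start_date, dept) followed by the eight records, each prefixed with a row number.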
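
read.csv() also accepts arguments that control how the file is interpreted. The sketch below spells out two of them and then uses str() to show the type R inferred for each column. It reads a few of the sample records inline through the text argument, and csv_text is a name chosen for this example.

```r
# A few of the sample records, supplied inline so no file is needed.
csv_text <- "id,name,salary,start_date,dept
1,Shubham,613.3,2012-01-01,IT
2,Arpita,525.2,2013-09-23,Operations
3,Vaishali,63,2014-11-15,IT"

data <- read.csv(
  text = csv_text,
  header = TRUE,            # the first line holds the column names (the default)
  stringsAsFactors = FALSE  # keep text columns as character vectors
)

# str() lists each column together with the type R inferred for it.
str(data)
```

Note that start_date is read as plain text here; it only becomes a date once we convert it with as.Date().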

Analyzing the CSV File

When we read data from a .csv file using the read.csv() function, the result is a data frame by default. Before analyzing the data, let's confirm this with the is.data.frame() function. After that, we will check the number of columns and the number of rows with the ncol() and nrow() functions.

Example

    csv_data <- read.csv("record.csv")
    print(is.data.frame(csv_data))
    print(ncol(csv_data))
    print(nrow(csv_data))

When we run the above code, it generates the following output:

Output

    [1] TRUE
    [1] 5
    [1] 8

The output confirms that our data was read as a data frame with five columns and eight rows. So we can apply all the data frame functions that we discussed in the earlier sections.
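
A few other base R functions give a quick overview of a data frame before any analysis. The sketch below is one way to use them. It reads an inline copy of the sample records through the text argument (with Finance assumed for the last row), so it runs without record.csv being present.

```r
# Inline copy of the sample records ("Finance" assumed for the last row).
csv_text <- "id,name,salary,start_date,dept
1,Shubham,613.3,2012-01-01,IT
2,Arpita,525.2,2013-09-23,Operations
3,Vaishali,63,2014-11-15,IT
4,Nishka,749,2014-05-11,HR
5,Gunjan,863.25,2015-03-27,Finance
6,Sumit,588,2013-05-21,IT
7,Anisha,932.8,2013-07-30,Operations
8,Akash,712.5,2014-06-17,Finance"
csv_data <- read.csv(text = csv_text)

# dim() returns the number of rows and columns in one call.
print(dim(csv_data))

# names() lists the column names taken from the header line.
print(names(csv_data))

# head() shows only the first few rows, which is handy for large files.
print(head(csv_data, 3))

# summary() gives the minimum, quartiles, mean and maximum of a numeric column.
print(summary(csv_data$salary))
```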

Example: Getting the maximum salary

    # Creating a data frame.
    csv_data <- read.csv("record.csv")

    # Getting the maximum salary from the data frame.
    max_sal <- max(csv_data$salary)
    print(max_sal)

Output

    [1] 932.8

Example: Getting the details of the person who has the maximum salary

    # Creating a data frame.
    csv_data <- read.csv("record.csv")

    # Getting the maximum salary from the data frame.
    max_sal <- max(csv_data$salary)
    print(max_sal)

    # Getting the details of the person who has the maximum salary.
    details <- subset(csv_data, salary == max(salary))
    print(details)

Output

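The code first prints 932.8 and then the matching record: Anisha (id 7) from the Operations department, who joined on 2013-07-30.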
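
As an alternative sketch, which.max() returns the position of the largest value, which we can use to pick the row directly. One difference worth knowing: which.max() returns only the first position when several rows tie for the maximum, whereas the subset() approach returns all of them. The example reads an inline copy of the sample records (Finance assumed for the last row), and top_row is a name chosen for this example.

```r
# Inline copy of the sample records ("Finance" assumed for the last row).
csv_text <- "id,name,salary,start_date,dept
1,Shubham,613.3,2012-01-01,IT
2,Arpita,525.2,2013-09-23,Operations
3,Vaishali,63,2014-11-15,IT
4,Nishka,749,2014-05-11,HR
5,Gunjan,863.25,2015-03-27,Finance
6,Sumit,588,2013-05-21,IT
7,Anisha,932.8,2013-07-30,Operations
8,Akash,712.5,2014-06-17,Finance"
csv_data <- read.csv(text = csv_text)

# which.max() gives the row position of the (first) largest salary.
top_position <- which.max(csv_data$salary)

# Use that position to select the whole row; the empty slot after the
# comma means "keep all columns".
top_row <- csv_data[top_position, ]
print(top_row)
```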

Example: Getting the details of everyone working in the IT department

    # Creating a data frame.
    csv_data <- read.csv("record.csv")

    # Getting the details of everyone who works in the IT department.
    details <- subset(csv_data, dept == "IT")
    print(details)

Output

The result contains the three IT employees: Shubham, Vaishali, and Sumit.

Example: Getting the details of the people who work in the IT department and whose salary is greater than 600

    # Creating a data frame.
    csv_data <- read.csv("record.csv")

    # Getting the details of the people who work in IT and earn more than 600.
    details <- subset(csv_data, dept == "IT" & salary > 600)
    print(details)

Output

Only Shubham, with a salary of 613.3, satisfies both conditions.
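
The same kind of filter can be written with logical vectors and square brackets, which makes each condition visible on its own. This is a sketch under the same assumptions as before: an inline copy of the sample records (Finance assumed for the last row), with is_it and earns_over_600 as names chosen for this example.

```r
# Inline copy of the sample records ("Finance" assumed for the last row).
csv_text <- "id,name,salary,start_date,dept
1,Shubham,613.3,2012-01-01,IT
2,Arpita,525.2,2013-09-23,Operations
3,Vaishali,63,2014-11-15,IT
4,Nishka,749,2014-05-11,HR
5,Gunjan,863.25,2015-03-27,Finance
6,Sumit,588,2013-05-21,IT
7,Anisha,932.8,2013-07-30,Operations
8,Akash,712.5,2014-06-17,Finance"
csv_data <- read.csv(text = csv_text)

# Each comparison produces one TRUE/FALSE value per row.
is_it <- csv_data$dept == "IT"
earns_over_600 <- csv_data$salary > 600

# `&` combines the conditions row by row: both must be TRUE.
both <- csv_data[is_it & earns_over_600, ]
print(both)

# `|` keeps rows where at least one condition is TRUE; the column names
# after the comma select which columns to show.
either <- csv_data[is_it | earns_over_600, c("name", "dept", "salary")]
print(either)
```

Note that the element-wise operators & and | are the ones to use here; && and || are meant for single TRUE/FALSE values, such as in an if statement.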

Example: Getting the details of the people who joined in or after 2014

    # Creating a data frame.
    csv_data <- read.csv("record.csv")

    # Getting the details of the people who joined in or after 2014.
    # >= is used so that a start date of 2014-01-01 itself is included.
    details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
    print(details)

Output

The result contains the four employees who joined in 2014 or later: Vaishali, Nishka, Gunjan, and Akash.
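
When several date-based questions are asked of the same data, it can be convenient to convert the column to the Date type once and reuse it. The sketch below does that and also derives a start_year column, which is a hypothetical helper column added for this example. It reads an inline copy of the sample records (Finance assumed for the last row).

```r
# Inline copy of the sample records ("Finance" assumed for the last row).
csv_text <- "id,name,salary,start_date,dept
1,Shubham,613.3,2012-01-01,IT
2,Arpita,525.2,2013-09-23,Operations
3,Vaishali,63,2014-11-15,IT
4,Nishka,749,2014-05-11,HR
5,Gunjan,863.25,2015-03-27,Finance
6,Sumit,588,2013-05-21,IT
7,Anisha,932.8,2013-07-30,Operations
8,Akash,712.5,2014-06-17,Finance"
csv_data <- read.csv(text = csv_text)

# Convert the text column to real dates once; the default format
# accepted by as.Date() is year-month-day, which matches the file.
csv_data$start_date <- as.Date(csv_data$start_date)

# Date values can be compared directly.
recent <- csv_data[csv_data$start_date >= as.Date("2014-01-01"), ]
print(recent)

# format() extracts parts of a date; "%Y" is the four-digit year.
# start_year is a helper column added only for this example.
csv_data$start_year <- as.integer(format(csv_data$start_date, "%Y"))

# Count how many people joined in each year.
print(table(csv_data$start_year))
```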

Writing into a CSV file

Besides reading and analyzing, R also allows us to write to a .csv file. For this purpose, R provides the write.csv() function, which creates a CSV file from an existing data frame. Unless we give a path, the file is created in the current working directory.

Let's see an example of how the write.csv() function is used to create an output CSV file.

Example

    csv_data <- read.csv("record.csv")

    # Getting the details of the people who joined in or after 2014.
    details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))

    # Writing the filtered data into a new file.
    # row.names = FALSE keeps R's row numbers out of the file.
    write.csv(details, "output.csv", row.names = FALSE)
    new_details <- read.csv("output.csv")
    print(new_details)

Output

Reading output.csv back prints the four filtered records: Vaishali, Nishka, Gunjan, and Akash.
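
By default, write.csv() also writes the data frame's row names as an unnamed first column, and read.csv() then reads that column back under the name X. The sketch below compares the two behaviours. The small details data frame and the temporary file names are made up for this example.

```r
# A small hypothetical data frame standing in for the filtered records.
details <- data.frame(
  id = c(3L, 4L),
  name = c("Vaishali", "Nishka"),
  salary = c(63, 749)
)

# tempfile() gives throwaway file names so nothing is overwritten.
with_names <- tempfile(fileext = ".csv")
without_names <- tempfile(fileext = ".csv")

# Default behaviour: row names are written as an extra first column.
write.csv(details, with_names)

# row.names = FALSE writes only the actual columns.
write.csv(details, without_names, row.names = FALSE)

# Compare the column names after reading both files back.
print(names(read.csv(with_names)))
print(names(read.csv(without_names)))
```

The first file comes back with an extra X column, while the second has exactly the original three columns.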
