How to import data in R Programming? (2024)

Importing Data in R

Importing data in R means reading data from external files into the R environment; R can also write data back out to external files. Data in formats such as CSV, XML, XLSX, and JSON, as well as web data, can be imported for analysis, and data held in the R environment can be saved to external files in the same formats.


Reading CSV Files

CSV (Comma Separated Values) is a text file in which the values in columns are separated by a comma.

Before importing data, it is convenient to set the working directory with the setwd() function, so that files in that directory can be referred to by name alone.

For example:

setwd("C:/Users/intellipaat/Desktop/BLOG/files")
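As an alternative to changing the working directory, a file path can be built explicitly. The following is a minimal sketch using base R only; tempdir() stands in for a real data folder, and the tiny file it creates is an assumption made so the example runs on its own.

```r
# Build a path explicitly instead of relying on setwd().
# tempdir() is a stand-in for a real data folder (assumption for this sketch).
data_dir <- tempdir()
csv_path <- file.path(data_dir, "file1.csv")

# Create a tiny file so the example is self-contained.
writeLines(c("empid,empname", "1,Sam"), csv_path)

# Check that the file exists before reading it.
if (file.exists(csv_path)) {
  emp <- read.csv(csv_path)
  print(emp)
}

# getwd() reports the current working directory without changing it.
print(getwd())
```

file.path() joins the parts with the correct separator, so the same code works on Windows, macOS, and Linux.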

To read a csv file, we use the in-built function read.csv() that outputs the data from the file as a data frame.


For example:

read.data <- read.csv("file1.csv")
print(read.data)

Output:

  empid empname   empdept empsalary empstart_date
1     1     Sam        IT     25000    03-09-2005
2     2     Rob        HR     30000    03-05-2005
3     3     Max Marketing     29000    05-06-2007
4     4    John       R&D     35000    01-03-1999
5     5    Gary   Finance     32000    05-09-2000
6     6    Alex      Tech     20000    09-05-2005
7     7    Ivar     Sales     36000    04-04-1999
8     8  Robert   Finance     34000    06-08-2008
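After reading a CSV file, it is often worth checking the column types. The sketch below writes two sample rows to a temporary file and reads them back; it assumes the start dates are in day-month-year order, which the tutorial does not state.

```r
# Write a small CSV to a temporary file (two sample rows for illustration).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "empid,empname,empdept,empsalary,empstart_date",
  "1,Sam,IT,25000,03-09-2005",
  "2,Rob,HR,30000,03-05-2005"
), csv_path)

# stringsAsFactors = FALSE keeps text columns as character vectors.
emp <- read.csv(csv_path, stringsAsFactors = FALSE)

# Dates arrive as text; convert them with an explicit format
# (day-month-year is an assumption here).
emp$empstart_date <- as.Date(emp$empstart_date, format = "%d-%m-%Y")

# str() shows the type of every column.
str(emp)
```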


Analyzing a CSV File

# To print the number of columns
print(ncol(read.data))

Output:

[1] 5

# To print the number of rows
print(nrow(read.data))

Output:

[1] 8

# To print the range of salary packages
range.sal <- range(read.data$empsalary)
print(range.sal)

Output:

[1] 20000 36000

# To print the details of the person with the highest salary, we use the subset() function to extract variables and observations
max.sal <- subset(read.data, empsalary == max(empsalary))
print(max.sal)

Output:

  empid empname empdept empsalary empstart_date
7     7    Ivar   Sales     36000    04-04-1999

# To print the details of all people working in the Finance department
fin.per <- subset(read.data, empdept == "Finance")
print(fin.per)

Output:

  empid empname empdept empsalary empstart_date
5     5    Gary Finance     32000    05-09-2000
8     8  Robert Finance     34000    06-08-2008
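The same selections can also be written with plain indexing. The following sketch uses a small data frame built inline (a few of the rows shown above) so that it runs without the CSV file; it is one possible way to do this, not the only one.

```r
# A few sample rows with the columns used in this section.
emp <- data.frame(
  empname   = c("Sam", "Gary", "Ivar", "Robert"),
  empdept   = c("IT", "Finance", "Sales", "Finance"),
  empsalary = c(25000, 32000, 36000, 34000),
  stringsAsFactors = FALSE
)

# Logical indexing: same rows as subset(emp, empdept == "Finance").
fin <- emp[emp$empdept == "Finance", ]
print(fin)

# which.max() returns the position of the first maximum value.
top <- emp[which.max(emp$empsalary), ]
print(top)

# Mean salary per department.
dept_mean <- aggregate(empsalary ~ empdept, data = emp, FUN = mean)
print(dept_mean)
```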


Writing to a CSV File

To write data to a CSV file, we use the write.csv() function. The output file is stored in the working directory of our R programming environment.
For example:

# To select the people whose salary is between 30000 and 40000 (the result is stored in a new file below)
per.sal <- subset(read.data, empsalary >= 30000 & empsalary <= 40000)
print(per.sal)

Output:

  empid empname empdept empsalary empstart_date
2     2     Rob      HR     30000    03-05-2005
4     4    John     R&D     35000    01-03-1999
5     5    Gary Finance     32000    05-09-2000
7     7    Ivar   Sales     36000    04-04-1999
8     8  Robert Finance     34000    06-08-2008

# Writing data into a new CSV file
write.csv(per.sal, "output.csv")
new.data <- read.csv("output.csv")
print(new.data)

Output:

  X empid empname empdept empsalary empstart_date
1 2     2     Rob      HR     30000    03-05-2005
2 4     4    John     R&D     35000    01-03-1999
3 5     5    Gary Finance     32000    05-09-2000
4 7     7    Ivar   Sales     36000    04-04-1999
5 8     8  Robert Finance     34000    06-08-2008

# To exclude the extra column X from the above file
write.csv(per.sal, "output.csv", row.names = FALSE)
new.data <- read.csv("output.csv")
print(new.data)

Output:

  empid empname empdept empsalary empstart_date
1     2     Rob      HR     30000    03-05-2005
2     4    John     R&D     35000    01-03-1999
3     5    Gary Finance     32000    05-09-2000
4     7    Ivar   Sales     36000    04-04-1999
5     8  Robert Finance     34000    06-08-2008
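The effect of row.names can be checked with a quick round trip. This is a minimal sketch that writes to a temporary file rather than the working directory; the two-row data frame is only for illustration.

```r
emp <- data.frame(empid = c(2, 4), empname = c("Rob", "John"),
                  stringsAsFactors = FALSE)
out_path <- tempfile(fileext = ".csv")

# Default: row names are written as an unnamed first column,
# which read.csv() names "X" when the file is read back.
write.csv(emp, out_path)
with_rownames <- read.csv(out_path)
print(names(with_rownames))

# row.names = FALSE: only the data columns are written.
write.csv(emp, out_path, row.names = FALSE)
without_rownames <- read.csv(out_path)
print(names(without_rownames))
```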

Reading XML Files

An XML (Extensible Markup Language) file stores both data and a description of its structure as plain text, and is widely used to share data on the web and elsewhere. Like an HTML file, it contains markup tags, but the tags in an XML file describe the meaning of the data contained in the file rather than the structure of the page.

For importing data in R from XML files, we need to install the XML package, which can be done as follows:

install.packages("XML")

To read XML files, we use the xmlParse() function from the XML package.

For example:

# To load the XML package required to read XML files
library("XML")

# To load other required packages
library("methods")

# To give the input file name to the function
newfile <- xmlParse(file = "file.xml")

print(newfile)

Output:

1 Sam 32000 1/1/2001 HR
2 Rob 36000 9/3/2006 IT
3 Max 42000 1/5/2011 Sales
4 Ivar 50000 25/1/2001 Tech
5 Robert 25000 13/7/2015 Sales
6 Leon 57000 5/1/2000 IT
7 Samuel 45000 27/3/2003 Operations
8 Jack 24000 6/1/2016 Sales

# To get the root node of the XML file
rootnode <- xmlRoot(newfile)
# To get the number of nodes in the root
rootsize <- xmlSize(rootnode)
print(rootsize)

Output:

[1] 8

# To print a specific node
print(rootnode[1])

Output:

$EMPLOYEE
1 Sam 32000 1/1/2001 HR

attr(,"class")
[1] "XMLInternalNodeList" "XMLNodeList"

# To print elements of a particular node
print(rootnode[[1]][[1]])
print(rootnode[[1]][[3]])
print(rootnode[[1]][[5]])

Output:

1
32000
HR

Converting an XML to a Data Frame

To perform data analysis effectively after importing data in R, we convert the data in an XML file to a Data Frame. After converting, we can perform data manipulation and other operations as performed in a data frame.

For example:

library("XML")
library("methods")
# To convert the data in the XML file to a data frame
xmldataframe <- xmlToDataFrame("file.xml")
print(xmldataframe)

Output:

  ID   NAME SALARY  STARTDATE       DEPT
1  1    Sam  32000 01/01/2001         HR
2  2    Rob  36000 09/03/2006         IT
3  3    Max  42000 01/05/2011      Sales
4  4   Ivar  50000 25/01/2001       Tech
5  5 Robert  25000 13/07/2015      Sales
6  6   Leon  57000 05/01/2000         IT
7  7 Samuel  45000 27/03/2003 Operations
8  8   Jack  24000 06/01/2016      Sales
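Values read from XML arrive as text, so numeric columns usually need converting before analysis. The sketch below parses a short XML string instead of a file; the root element name RECORDS is a made-up placeholder, and the code only runs if the XML package is installed.

```r
if (requireNamespace("XML", quietly = TRUE)) {
  library(XML)

  # Inline XML standing in for file.xml (the RECORDS root name is hypothetical).
  xml_text <- "<RECORDS>
    <EMPLOYEE><ID>1</ID><NAME>Sam</NAME><SALARY>32000</SALARY></EMPLOYEE>
    <EMPLOYEE><ID>2</ID><NAME>Rob</NAME><SALARY>36000</SALARY></EMPLOYEE>
  </RECORDS>"

  # asText = TRUE tells xmlParse() that the input is XML content, not a file name.
  doc <- xmlParse(xml_text, asText = TRUE)
  df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

  # Convert the salary column from character to numeric.
  df$SALARY <- as.numeric(df$SALARY)
  str(df)
}
```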

Reading JSON Files

A JSON (JavaScript Object Notation) file is commonly used to exchange data between a web application and a server. JSON files are text-based and human-readable, and can be edited with an ordinary text editor.
Importing data in R from a JSON file requires the rjson package, which can be installed as follows:

install.packages("rjson")

To read JSON files, we use the fromJSON() function from the rjson package, which stores the data as a list.

For example:

# To load the rjson package
library("rjson")
# To give the file name to the function
newfile <- fromJSON(file = "file1.json")
# To print the file
print(newfile)

Output:

$ID
[1] "1" "2" "3" "4" "5" "6" "7" "8"

$Name
[1] "Sam"    "Rob"    "Max"    "Robert" "Ivar"   "Leon"   "Samuel" "Ivar"

$Salary
[1] "32000" "27000" "35000" "25000" "37000" "41000" "36000" "51000"

$StartDate
[1] "1/1/2001"   "9/3/2003"   "1/5/2004"   "14/11/2007" "13/7/2015"  "4/3/2007"
[7] "27/3/2013"  "25/7/2000"

$Dept
[1] "IT"         "HR"         "Tech"       "HR"         "Sales"      "HR"
[7] "Operations" "IT"


Converting a JSON File to a Data Frame

To convert the list read from a JSON file to a data frame, we use the as.data.frame() function.
For example:

library("rjson")
newfile <- fromJSON(file = "file1.json")
# To convert the JSON data to a data frame
jsondataframe <- as.data.frame(newfile)
print(jsondataframe)

Output:

  ID   Name Salary  StartDate       Dept
1  1    Sam  32000 01/01/2001         IT
2  2    Rob  27000 09/03/2003         HR
3  3    Max  35000 01/05/2004       Tech
4  4   Ivar  25000 14/11/2007         HR
5  5 Robert  37000 13/07/2015      Sales
6  6   Leon  41000 04/03/2007         HR
7  7 Samuel  36000 27/03/2013 Operations
8  8   Jack  51000 25/07/2000         IT
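Because the JSON values above are stored as strings, the resulting columns are character vectors. The following sketch parses a short JSON string (an illustrative stand-in for file1.json) and converts the salary column to numbers; it only runs if the rjson package is installed.

```r
if (requireNamespace("rjson", quietly = TRUE)) {
  # Inline JSON standing in for file1.json (values are illustrative).
  json_text <- '{"ID": ["1", "2"], "Name": ["Sam", "Rob"], "Salary": ["32000", "27000"]}'

  # The first argument of rjson::fromJSON() is a JSON string.
  parsed <- rjson::fromJSON(json_text)

  # Each list element becomes one column of the data frame.
  df <- as.data.frame(parsed, stringsAsFactors = FALSE)

  # Salaries were quoted in the JSON, so convert them explicitly.
  df$Salary <- as.numeric(df$Salary)
  str(df)
}
```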

Reading Excel Files

Microsoft Excel is a very popular spreadsheet program that stores data in the xls and xlsx formats. We can read data from Excel files using the readxl package in R.

To install the readxl package, run the following command:

install.packages("readxl")

To import data from an Excel file, we load the package and use the read_excel() function, which returns the sheet as a data frame.

library("readxl")
newfile <- read_excel("sheet1.xlsx")
print(newfile)

Output:

  ID   NAME    DEPT SALARY AGE
1  1    SAM   SALES  32000  35
2  2    ROB      HR  36000  23
3  3    MAC      IT  37000  40
4  4   IVAR      IT  25000  37
5  5    MAX     R&D  30000  22
6  6 ROBERT      HR  27000  32
7  7 SAMUEL FINANCE  50000  41
8  8 RAGNAR   SALES  45000  29
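A workbook can hold several sheets, and read_excel() can be pointed at one of them. The sketch below uses the example workbook that ships with readxl (datasets.xlsx) rather than sheet1.xlsx, so it runs without any extra files; it is skipped if readxl is not installed.

```r
if (requireNamespace("readxl", quietly = TRUE)) {
  library(readxl)

  # Path to an example workbook bundled with the readxl package.
  path <- readxl_example("datasets.xlsx")

  # List the sheets in the workbook.
  print(excel_sheets(path))

  # Read only the first five data rows of one named sheet.
  cars <- read_excel(path, sheet = "mtcars", n_max = 5)
  print(cars)
}
```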


Reading HTML Tables


You can import HTML tables into R with the following command.

# Load the XML package, which provides readHTMLTable()
library(XML)
# Assign your URL to `url`
url <- ""
# Read the HTML table
data_df <- readHTMLTable(url, which=3)

If the above command returns an error, you can use the following, which combines the RCurl and XML packages.

# Activate the libraries
library(XML)
library(RCurl)
# Assign your URL to `url`
url <- "YourURL"
# Get the data
urldata <- getURL(url)
# Read the HTML table
data <- readHTMLTable(urldata, stringsAsFactors = FALSE)

Alternatively, you can fetch the page with the httr package and use the rawToChar() function to convert the raw response content to text before reading the HTML tables.

# Activate `httr`
library(httr)
# Get the URL data
urldata <- GET(url)
# Read the HTML table
data <- readHTMLTable(rawToChar(urldata$content), stringsAsFactors = FALSE)

To read HTML tables from websites and retrieve data from them, we use the XML and RCurl packages in R programming.

To install XML and RCurl packages, run the following command:

install.packages("XML")
install.packages("RCurl")

To load the packages, run the following command:

library("XML")
library("RCurl")

For example, we will fetch the ‘Ease of Doing Business Index’ table from a URL using the readHTMLTable() function which stores it as a Data Frame.

# To fetch a table from any website, paste the URL
url <- "https://en.wikipedia.org/wiki/Ease_of_doing_business_index#Ranking"
tabs <- getURL(url)
# To fetch the first table, if the webpage has more than one table, we use which = 1
tabs <- readHTMLTable(tabs, which = 1, stringsAsFactors = FALSE)
head(tabs)

Output:

              V1           V2   V3   V4   V5   V6   V7   V8   V9  V10  V11  V12  V13
1 Classification Jurisdiction 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009
2      Very Easy  New Zealand    1    1    1    2    2    3    3    3    3    2    2
3      Very Easy    Singapore    2    2    2    1    1    1    1    1    1    1    1
4      Very Easy      Denmark    3    3    3    3    4    5    5    5    6    6    5
5      Very Easy    Hong Kong    4    5    4    4    3    2    2    2    2    3    4
6      Very Easy  South Korea    5    4    5    5    5    7    8    8   16   19   23
   V14  V15  V16
1 2008 2007 2006
2    2    2    1
3    1    1    2
4    5    7    8
5    4    5    7
6   30   23   27


We use the str() function to analyze the structure of the data frame.
For example:

str(tabs)

Output:

'data.frame':  191 obs. of  16 variables:
 $ V1 : chr  "Classification" "Very Easy" "Very Easy" "Very Easy" ...
 $ V2 : chr  "Jurisdiction" "New Zealand" "Singapore" "Denmark" ...
 $ V3 : chr  "2019" "1" "2" "3" ...
 $ V4 : chr  "2018" "1" "2" "3" ...
 $ V5 : chr  "2017" "1" "2" "3" ...
 $ V6 : chr  "2016" "2" "1" "3" ...
 $ V7 : chr  "2015" "2" "1" "4" ...
 $ V8 : chr  "2014" "3" "1" "5" ...
 $ V9 : chr  "2013" "3" "1" "5" ...
 $ V10: chr  "2012" "3" "1" "5" ...
 $ V11: chr  "2011" "3" "1" "6" ...
 $ V12: chr  "2010" "2" "1" "6" ...
 $ V13: chr  "2009" "2" "1" "5" ...
 $ V14: chr  "2008" "2" "1" "5" ...
 $ V15: chr  "2007" "2" "1" "7" ...
 $ V16: chr  "2006" "1" "2" "8" ...

# To print rows from 5 to 10 and columns from 1 to 8
T1 <- tabs[5:10, 1:8]
head(T1)

Output:

          V1             V2 V3 V4 V5 V6 V7 V8
5  Very Easy      Hong Kong  4  5  4  5  3  2
6  Very Easy    South Korea  5  4  5  4  5  7
7  Very Easy        Georgia  6  9 16 24 15  8
8  Very Easy         Norway  7  8  6  9  6  9
9  Very Easy  United States  8  6  8  7  7  4
10 Very Easy United Kingdom  9  7  7  6  8 10

# To find the position of India in the table
T1 <- subset(tabs, tabs$V2 == "India")
head(T1)

Output:

     V1    V2 V3  V4  V5  V6  V7  V8  V9 V10 V11 V12 V13 V14 V15 V16
78 Easy India 77 100 130 130 142 134 132 132 134 133 122 120 134 116
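The str() output above shows that the real column headers sit in the first row and that every column is text. One possible cleanup, sketched here on a small stand-in data frame built from the first rows and columns shown above, is to promote the first row to column names and convert the year columns to numbers.

```r
# Stand-in for the scraped table: header in the first row, all columns text.
tabs <- data.frame(
  V1 = c("Classification", "Very Easy", "Very Easy"),
  V2 = c("Jurisdiction", "New Zealand", "Singapore"),
  V3 = c("2019", "1", "2"),
  V4 = c("2018", "1", "2"),
  stringsAsFactors = FALSE
)

# Use the first row as column names, then drop that row.
names(tabs) <- unlist(tabs[1, ])
clean <- tabs[-1, ]
rownames(clean) <- NULL

# Convert the year columns from character to numeric.
year_cols <- setdiff(names(clean), c("Classification", "Jurisdiction"))
clean[year_cols] <- lapply(clean[year_cols], as.numeric)

str(clean)
```

Column names such as 2019 are not valid R identifiers, so they have to be accessed as clean[["2019"]] or with backticks.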

1. SPSS FILES INTO R

To import SPSS files into R, install the foreign package and call its read.spss() function, as in the following command.

# Activate the `foreign` library
library(foreign)
# Read the SPSS data
mySPSSData <- read.spss("example.sav")

This works fine if you are currently using SPSS software.

The following command comes in handy if you would like the result as a data frame.

# Activate the `foreign` library
library(foreign)
# Read the SPSS data
mySPSSData <- read.spss("example.sav", to.data.frame=TRUE, use.value.labels=FALSE)

Set the to.data.frame argument to TRUE to receive the result as a data frame, and set the use.value.labels argument to FALSE if you do not want variables with value labels converted to R factors.

2. STATA FILES

You can import Stata files into R via the foreign package with the following command.

# Activate the `foreign` library
library(foreign)
# Read Stata data into R
mydata <- read.dta("")

3. SYSTAT FILES

You can import Systat files into R via the foreign package with the following command.

# Activate the `foreign` library
library(foreign)
# Read Systat data
mydata <- read.systat("")

4. SAS FILES

To import SAS files into R, install the sas7bdat package and call the read.sas7bdat() function.

# Activate the `sas7bdat` library
library(sas7bdat)
# Read in the SAS data
mySASData <- read.sas7bdat("example.sas7bdat")

Alternatively, if you are using the foreign package, you can import SAS data with the read.ssd() and read.xport() functions.

5. MINITAB

To import Minitab (.mtp) files into R, install the foreign package and use the read.mtp() function, as in the following command.

# Activate the `foreign` library
library(foreign)
# Read the Minitab data
myMTPData <- read.mtp("example2.mtp")

6. RDA/ RDATA

You can import your .rdata file into R through the following command.

load(".RDA")
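To show what load() restores, the following base R sketch saves a small data frame to a temporary .RData file, removes it, and loads it back. The object and file names are illustrative only.

```r
emp <- data.frame(empid = 1:2, empname = c("Sam", "Rob"))
rda_path <- tempfile(fileext = ".RData")

# save() stores objects together with their names.
save(emp, file = rda_path)

# Remove the object, then restore it from the file.
# load() returns the names of the objects it restored.
rm(emp)
restored <- load(rda_path)
print(restored)  # [1] "emp"
print(emp)

# saveRDS()/readRDS() store a single object without its name,
# so the result can be assigned to any variable.
rds_path <- tempfile(fileext = ".rds")
saveRDS(emp, rds_path)
emp_copy <- readRDS(rds_path)
```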

7. READ RELATIONAL AND NON-RELATIONAL DATABASES INTO R

The following are the steps to import data from relational databases by using MonetDB.

Step 1: Start the MonetDB daemon monetdbd and create a new database called “voc”
Step 2: Install MonetDB.R from the R shell

> install.packages("MonetDB.R")

Step 3: Load the MonetDB.R library

> library(MonetDB.R)
Loading required package: DBI
Loading required package: digest

Step 4: Create a connection to the database

> conn <- dbConnect(MonetDB.R(), host="localhost", dbname="demo", user="monetdb", password="monetdb")

Step 5: Create a database directly from R

> dbGetQuery(conn, "SELECT 'insert data'")
  single_value

Step 6: Repeat Step 4 multiple times.
Step 7: Install and load dplyr to manipulate datasets in R

> install.packages("dplyr")
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
    filter, lag
The following objects are masked from ‘package:base’:
    intersect, setdiff, setequal, union

Step 8: Make a connection to database for dplyr

> monetdb_conn <- src_monetdb("demo")

Final step: Create database for future import in R

> craftsmen <- tbl(monetdb_conn, "insert data")
impotenten <- tbl(monetdb_conn, "insert data")
invoices <- tbl(monetdb_conn, "insert data")
passengers <- tbl(monetdb_conn, "insert data")
seafarers <- tbl(monetdb_conn, "insert data")
soldiers <- tbl(monetdb_conn, "insert data")
total <- tbl(monetdb_conn, "insert data")
voyages <- tbl(monetdb_conn, "insert data")
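The connect, query, and disconnect pattern used above comes from the DBI package and looks much the same for other relational databases. As a rough sketch that runs without a database server, the example below swaps in an in-memory SQLite database through the RSQLite package (not MonetDB); the table contents are made up for illustration, and the code is skipped if DBI or RSQLite is not installed.

```r
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # In-memory SQLite database as a stand-in for a MonetDB server (assumption).
  conn <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Hypothetical table used only for this example.
  voyages <- data.frame(number = c(1, 2), boatname = c("A", "B"))
  DBI::dbWriteTable(conn, "voyages", voyages)

  # List tables and pull a query result into a data frame.
  print(DBI::dbListTables(conn))
  result <- DBI::dbGetQuery(conn, "SELECT * FROM voyages WHERE number > 1")
  print(result)

  # Always close the connection when done.
  DBI::dbDisconnect(conn)
}
```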

8. IMPORTING DATA FROM NON-RELATIONAL DATABASES

The following are the steps to import data from non-relational databases to R by using MongoDB.

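Step 1: Install MongoDB and load the data into it, for example with the following Python script, which uses pandas and pymongo.

import pandas
import pymongo

# Read the tab-separated source file into a data frame
df = pandas.read_table('../data/csdata.txt')
# Turn every row into a dictionary keyed by column name
lst = [dict([(colname, row[i]) for i, colname in enumerate(df.columns)]) for row in df.values]
for i in range(3):
    print(lst[i])

# Connect to the local MongoDB server (MongoClient replaces the older Connection class)
con = pymongo.MongoClient('localhost', port=27017)
test = con.db.test
test.drop()
# Insert the rows one by one (insert_one replaces the older save method)
for i in lst:
    test.insert_one(i)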

Step 2: Using RMongo, run the following commands.

library(RMongo)
mg1 <- mongoDbConnect('db')
print(dbShowCollections(mg1))
query <- dbGetQuery(mg1, 'test', "{'AGE': {'$lt': 10}, 'LIQ': {'$gte': 0.1}, 'IND5A': {'$ne': 1}}")
data1 <- query[c('AGE', 'LIQ', 'IND5A')]
summary(data1)

Step 3: You will receive the output as the following.

Loading required package: rJava
Loading required package: methods
Loading required package: RUnit
[1] "system.indexes" "test"
      AGE             LIQ             IND5A
 Min.   :6.000   Min.   :0.1000   Min.   :0
 1st Qu.:7.000   1st Qu.:0.1831   1st Qu.:0
 Median :8.000   Median :0.2970   Median :0
 Mean   :7.963   Mean   :0.3745   Mean   :0
 3rd Qu.:9.000   3rd Qu.:0.4900   3rd Qu.:0
 Max.   :9.000   Max.   :1.0000   Max.   :0

9. IMPORTING DATA THROUGH WEB SCRAPING

Step 1: Load the packages.

library(rvest)
library(stringr)
library(plyr)
library(dplyr)
library(ggvis)
library(knitr)
options(digits = 4)

Step 2: Save the following PhantomJS script as scrape_techstars.js.

// scrape_techstars.js
var webPage = require('webpage');
var page = webPage.create();
var fs = require('fs');
var path = 'techstars.html';
page.open('http://www.techstars.com/companies/stats/', function (status) {
  var content = page.content;
  fs.write(path, content, 'w');
  phantom.exit();
});

Step 3: Run the script from R with the system() function.

# Let phantomJS scrape techstars, output is written to techstars.html
system("./phantomjs scrape_techstars.js")

Step 4: Read the saved page and select the batch nodes.

batches <- html("techstars.html") %>% html_nodes(".batch")
class(batches)
[1] "XMLNodeSet"

Step 5: Extract the batch and company information and combine it into one data frame.

batch_titles <- batches %>% html_nodes(".batch_class") %>% html_text()
batch_season <- str_extract(batch_titles, "(Fall|Spring|Winter|Summer)")
batch_year <- str_extract(batch_titles, "([[:digit:]]{4})")
# location info is everything in the batch title that is not year info or season info
batch_location <- sub("\\s+$", "",
                      sub("([[:digit:]]{4})", "",
                          sub("(Fall|Spring|Winter|Summer)", "", batch_titles)))
# create data frame with batch info.
batch_info <- data.frame(location = batch_location, year = batch_year, season = batch_season)
breakdown <- lapply(batches, function(x) {
  company_info <- x %>% html_nodes(".parent")
  companies_single_batch <- lapply(company_info, function(y) {
    as.list(gsub("\\[\\+\\]\\[\\-\\]\\s", "", y %>% html_nodes("td") %>% html_text()))
  })
  df <- data.frame(matrix(unlist(companies_single_batch),
                          nrow = length(companies_single_batch), byrow = TRUE,
                          dimnames = list(NULL, c("company", "funding", "status", "hq"))))
  return(df)
})
# Add batch info to breakdown
batch_info_extended <- batch_info[rep(seq_len(nrow(batch_info)), sapply(breakdown, nrow)), ]
breakdown_merged <- rbind.fill(breakdown)
# Merge all information
techstars <- tbl_df(cbind(breakdown_merged, batch_info_extended)) %>%
  mutate(funding = as.numeric(gsub(",", "", gsub("\\$", "", funding))))

Step 6: Inspect the result.

## Source: local data frame [535 x 7]
##
##          company funding   status                hq location year season
## 1    Accountable  110000   Active    Fort Worth, TX   Austin 2013   Fall
## 2          Atlas 1180000   Active        Austin, TX   Austin 2013   Fall
## 3        Embrace  110000   Failed        Austin, TX   Austin 2013   Fall
## 4  Filament Labs 1490000   Active        Austin, TX   Austin 2013   Fall
## 5        Fosbury  300000   Active        Austin, TX   Austin 2013   Fall
## 6          Gone!  840000   Active San Francisco, CA   Austin 2013   Fall
## 7     MarketVibe  110000 Acquired        Austin, TX   Austin 2013   Fall
## 8           Plum 1630000   Active        Austin, TX   Austin 2013   Fall
## 9  ProtoExchange  110000   Active        Austin, TX   Austin 2013   Fall
## 10       Testlio 1020000   Active        Austin, TX   Austin 2013   Fall
## ..           ...     ...      ...               ...      ...  ...    ...
names(techstars)
## [1] "company"  "funding"  "status"   "hq"       "location" "year"
## [7] "season"
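The text-cleaning part of the scraping code can be tried on its own. The sketch below uses base R only and two made-up batch titles in a "location season year" shape (an assumption about the page's format) to show how the season, year, and location are separated, and how a funding string is turned into a number.

```r
# Hypothetical batch titles in the "<location> <season> <year>" shape.
batch_titles <- c("Austin Fall 2013", "New York City Spring 2014")

season_pattern <- "(Fall|Spring|Winter|Summer)"
year_pattern   <- "[[:digit:]]{4}"

# regmatches() + regexpr() return the first match in each string.
batch_season <- regmatches(batch_titles, regexpr(season_pattern, batch_titles))
batch_year   <- regmatches(batch_titles, regexpr(year_pattern, batch_titles))

# Remove season and year, then trim the trailing whitespace.
# In R source code the regex \s must be written as "\\s".
batch_location <- sub("\\s+$", "",
                      sub(year_pattern, "", sub(season_pattern, "", batch_titles)))

batch_info <- data.frame(location = batch_location, year = batch_year,
                         season = batch_season, stringsAsFactors = FALSE)
print(batch_info)

# Strip the dollar sign and thousands separators before converting to a number.
funding <- as.numeric(gsub(",", "", gsub("\\$", "", "$1,180,000")))
print(funding)
```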

10. IMPORTING DATA THROUGH TM PACKAGE

You can import text data through the tm package by installing and loading it, and then reading the text as follows.

library(tm)
text <- readLines("")

And in the final step, write the following:

docs <- Corpus(VectorSource(text))
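As a small illustration of these two steps, the sketch below writes three lines to a temporary text file (made-up content), reads them with readLines(), drops the empty line, and builds a corpus. The corpus step only runs if the tm package is installed.

```r
# Create a small text file so the example is self-contained.
text_path <- tempfile(fileext = ".txt")
writeLines(c("R can import data from many sources.",
             "",
             "CSV is one of them."), text_path)

# readLines() returns one character string per line of the file.
text <- readLines(text_path)

# Drop empty lines before building the corpus.
text <- text[nzchar(text)]
print(length(text))

if (requireNamespace("tm", quietly = TRUE)) {
  # Each element of the vector becomes one document in the corpus.
  docs <- tm::Corpus(tm::VectorSource(text))
  print(length(docs))
}
```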

In this tutorial, we learned what importing data in R is, how to read files in different formats in R, and how to convert data from files to data frames for efficient data manipulation. In the next session, we are going to talk about data manipulation in R.


Article information

Author: Tish Haag
