Connecting to SQL Server using RODBC package in R

RODBC is a useful package for making SQL queries to SQL Server database for R users.

To start using this package, first we need to make sure it is loaded into R studio.


is.installed <- function(mypkg) is.element(mypkg, installed.packages()[,1])

if(!is.installed('RODBC'))
{
install.packages("RODBC")
}

library(RODBC)

Now that the package is installed/loaded into our environment, let’s start making SQL queries.

First we will need to provide credentials for accessing our SQL Server Database for R. Mainly we will need to have correct info for:

  1. Server name
  2. Database name
  3. Login Id
  4. Password

Providing connection info:


conn <- odbcDriverConnect('driver={SQL Server};
server=SOVITHOME-PC\\RAY;
database=EmergencyRMDb;
uid=sa;
pwd=myPwd***')

Now the “conn” object holds connection for SQL Server. Using this object we can make different kinds of queries supported by RODBC. Let’s perform some common queries like:

  • SELECT statements
  • SELECT statements with WHERE Clause
  • UPDATE statements
  • SQL JOIN statements

# print row count of tables
sqlQuery(conn, "select count(*) from EQData", as.is=T)

# print name of columns
colnames(tblData)

# select all data from a table
tblData &lt;- sqlQuery(conn, "select * from EQData", as.is=T)
#preview data
head(tblData)

<img class="wp-image-542" src="http://sovitpoudel.com.np/wp-content/uploads/2017/03/result1-300x68.png" alt="" width="850" height="193"> Result from query


# Viewing columns and rows

# [x,y] -> x represents the row number and y represents the column number

tblData[,1:5] # view columns 1 through 5

tblData[2,] # view first two rows

tblData$Latitude[2] # view second value for specified column
# finding by matching row(s) value
tblData[tblData$Epicentre == "Kathmandu",]

# update tables
sqlQuery(conn, "update EQData set latitude = '27.75' where epicentre = 'kathmandu'")

SQL JOIN statements can also be easily performed using “merge” function.


# perform inner join
newT <- merge(tblA, tblB, by = "key_id")
dim(newT)

# perform outer join
newOuterT <- merge(tblA, tblB, by = "key_id", all = TRUE)
dim(newOuterT)

# perform left join
newLeftT <- merge(tblA, tblB, by = "key_id", all.x = TRUE)
dim(newLeftT)

# perform right join
newRightT <- merge(tblA, tblB, by = "key_id", all.y = TRUE)
dim(newRightT)

These are some common operations that we can easily perform using R and RODBC on SQL Server tables. As you can see, most of the common operations can be performed using built-in R functions for Data Frames on R. I am just making use of “sqlQuery” function provided by RODBC for making queries on the Server.

Please find the full list of RODBC supported queries at: https://cran.r-project.org/web/packages/RODBC/RODBC.pdf

Hope this was helpful. If you got stuck on any of the steps, above please do not hesitate to leave comments below.

Thanks!

Tags: , ,