
SAS to R: Logging & Auditing in R Video

Video that compares SAS and R logging and demonstrates how we’ve created logging and auditing capabilities in R.



SAS to R

Exploration of a common pitfall in SAS to R conversions.



Downloadable PDF: SAS_to_R_Common_Pitfalls


SAS to R

We often get asked about best practices in performing SAS to R conversions.  Below is a presentation we’ve put together to help organizations navigate switching from SAS to R.


Download the PDF: SAS_to_R_Conversion


SAS to R Conversion Assessment Tool

Rconvert, a division of Boston Decision, is pleased to report that we’ve recently completed a tool to assist with SAS to R code conversion.

Based on extensive expertise with both SAS and R programming, the tool has been developed to apply a series of best practice assessments designed to gauge the ease of conversion and detect areas where caution should be taken to ensure a successful switch from SAS to R.

Using the tool, we have begun offering a rapid SAS to R assessment, free of charge.  Interested parties may contact us via the Rconvert.com website or by calling (617)-500-0093.



R for Electronic Medical Record (EMR / EHR) Analysis and Data Mining – Will R Take The Lead?

Healthcare organizations are on the move, working feverishly to implement Electronic Medical Record (EMR) and Electronic Health Record (EHR) systems as part of a federal “requirement” enacted by the American Recovery and Reinvestment Act of 2009.  This requirement forces healthcare organizations to implement and make effective use of electronic medical record systems by 2015, or risk having Medicare reimbursements reduced.  In the rush to implement such systems, little attention has been focused on what may be the greatest contribution to the healthcare field of our time – analysis and data mining of such medical records to detect, better treat, and ultimately prevent illness.  We believe that R, an open-source data analysis language, is best positioned to make such analysis possible.

In fact, we predict that electronic medical record vendors will soon be embedding or otherwise implementing R into their solutions.

While the benefits of electronic medical records versus a paper-based alternative have long been documented, fewer than 50% of US health organizations had adopted such technology by 2009.  Cost and a lack of standards have been two of the major reasons for the delay in adoption.  With the federal government creating financial incentives to ensure that such technology is adopted, we believe that costs and standards will no longer represent significant barriers to entry moving forward.  Electronic medical records will be adopted.  But then what…

Data mining and analysis of electronic medical record data is the next frontier.

While the frenzy to implement EMR systems has focused so much attention on the practice of storing medical records, little time has been spent determining how such data can be fully leveraged to improve patient care and health systems as a whole.  In a previous article, we discussed some of the implications of electronic medical records in the insurance space.  We concluded that adoption would initially increase malpractice costs for physicians, since medical mistakes are easier to catch when more information is being recorded about a patient’s treatment.  However, we ultimately find that adoption of such systems will have an enormous impact on improving patient care beyond the current paper-vs.-digital benefits boasted by EMR vendors.

We see the major benefits originating from data mining of EMR records.  For example, what if medical records were analyzed in real-time to create more personalized medicine?  What if we could quickly measure how patients of a similar background responded to various treatment options, then use that information to help treat a current patient?  What about predicting length of hospital stay using medical record information, enabling hospitals to staff and allocate resources more effectively?
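As a toy illustration of the last idea, here is a minimal R sketch that fits a length-of-stay model on simulated data.  The variable names (age, num_diagnoses, los) and the relationships between them are entirely hypothetical stand-ins for real EMR fields:

```r
# Sketch only: simulated data standing in for real EMR fields
set.seed(42)
n <- 500
age           <- rnorm(n, mean = 55, sd = 15)   # patient age
num_diagnoses <- rpois(n, lambda = 3)           # count of recorded diagnoses

# Simulated length of stay (days), loosely tied to the predictors
los <- 2 + 0.05 * age + 0.8 * num_diagnoses + rnorm(n)

# Fit a simple linear model predicting length of stay
fit <- lm(los ~ age + num_diagnoses)
summary(fit)

# Predict length of stay for a new (hypothetical) patient
predict(fit, newdata = data.frame(age = 70, num_diagnoses = 5))
```

In practice one would reach for richer models (survival models, mixed effects, machine learning methods), all of which are available as R packages, but the workflow above is the basic shape.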

We view R, an open-source data analysis language, as positioned to make this vision a reality.  We believe R is best positioned to analyze electronic medical records for the following eight reasons:

  1. Given that standards for EMR systems are still in flux, any solution to data mining of such records should be flexible and capable of adapting to shifting EMR standards. R is positioned well for this environment, as it already integrates and connects into a plethora of database management systems.
  2. The technology will need to be capable of analyzing very large amounts of data – millions to billions of records.  R enables parallel processing and can be used in conjunction with Hadoop and other technologies to spread analysis out to distributed hardware.
  3. As the EMR space is a rapidly growing field, the analytical technology that it’s paired with should also be on a growth trajectory.  Because R is open-source, new methods and techniques are implemented in R faster than in proprietary alternatives.
  4. The analytical technology should work on many different operating systems in order to service the variety of hardware/software solutions used by healthcare organizations.  R fulfills this requirement and is cross-platform.  It works on Windows, Mac, and Unix.
  5. The analytical technology should have a large user base to support the needs of the healthcare space.  R has a large, international community that includes some of the brightest minds.  R is also taught in most of the top academic statistical programs across the US.
  6. The technology must be transparent.  Once again, R is open-source, enabling anyone to go in and understand what it is doing.  Also, R is very well-documented in the literature.
  7. The technology must have very strong support for unstructured data analysis, as much of EMR data is unstructured text.  R has a list of very powerful text mining and unstructured data analysis packages / libraries.
  8. The technology needs to be affordable.  R satisfies this requirement: R is free.
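As a sketch of reason 2, base R’s parallel package (shipped with R itself) can fan work out across local worker processes; the squaring function below is a toy stand-in for a per-record analysis step:

```r
library(parallel)

# Toy workload standing in for per-record analysis of EMR data
square <- function(x) x^2

# Spread the work across a small cluster of local worker processes
cl <- makeCluster(2)
result <- parLapply(cl, 1:10, square)
stopCluster(cl)

unlist(result)  # → 1 4 9 16 25 36 49 64 81 100
```

The same pattern scales from a laptop to a distributed cluster, and packages that bridge R to Hadoop extend it further.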

Similar to the inevitability of EMR adoption by mainstream US healthcare, we view data mining of such records as the next surge.  The question is, who will be the leader in this space?  We believe it will be R.

For those interested in discussing this topic further, contact Timothy D’Auria at tdauria@bostondecision.com.


12 Reasons Users are Switching from SAS to R

According to a 2010 poll conducted by KDNuggets:

49.6% of current SAS users surveyed are considering a switch away from SAS.

Of the above, 32.8% are considering a switch to R.

Why switch to R?  Here are just a few reasons:

1. It’s completely free.  No licensing fees.  Ever!

2. It can handle big data (gigabytes of data and millions of records).

3. Fantastic parallel processing.

4. R is cross-platform (Windows, Mac, Unix, etc.).

5. Database integration (Oracle, MySQL, SQLite, Access, PostgreSQL, Microsoft SQL Server, etc.).

6. Top companies like Google, Facebook, and Pfizer use it.

7. R has more cutting-edge analytical approaches than any other language out there.

8. R is open-source.  Change it.  Deploy it to the web.  Your imagination is the limit.

9. Very large community – Get help fast.  Nearly all major universities have started teaching R in their classrooms.

10. R is stable.  Its predecessor, the S language, dates back to Bell Labs.

11. R is growing at breakneck speed – if you can dream it, someone is probably writing it in R.

12. R is the leading data mining tool used by 43% of data miners according to the 2010 Rexer’s Annual Data Mining Survey.
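As a sketch of reason 5 (database integration), the snippet below uses the DBI and RSQLite packages (an assumption: both must be installed).  An in-memory SQLite database stands in for Oracle, MySQL, SQL Server, or any other backend with a DBI driver:

```r
library(DBI)
library(RSQLite)

# In-memory SQLite database stands in for an enterprise RDBMS
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "patients", data.frame(id = 1:3, age = c(34, 56, 71)))

# Run SQL from R and get a data.frame back
res <- dbGetQuery(con, "SELECT id, age FROM patients WHERE age > 50")
print(res)

dbDisconnect(con)
```

Swapping backends is largely a matter of changing the driver passed to dbConnect(); the query code stays the same.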

BostonDecision.com (and our Rconvert.com division) has specific expertise in performing SAS to R conversions.  We can help guide interested firms through the pros, cons, risks, and benefits of converting.


Reading data from URL – SAS vs R

Have you ever wanted to read data directly from a web site without needing to download it to an intermediate file for import?  Below are approaches that may be used to read data directly from a URL, first in R and then in SAS.


R:

handle <- url("http://datasite.com")
dta <- readLines(handle)
close(handle)


SAS:

filename handle url 'http://datasite.com';
proc import datafile=handle out=dta dbms=csv replace;
run;
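For delimited data, the R connection step can be skipped entirely, since read.csv() accepts a URL string directly.  The sketch below demonstrates the call against a local temporary file so that it is self-contained; the URL in the comment is a placeholder:

```r
# read.csv() accepts a URL string directly, e.g.:
#   dta <- read.csv("http://datasite.com/data.csv")   # placeholder URL
# Demonstrated here with a local file so the sketch is self-contained:
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c("a", "b", "c")), tmp, row.names = FALSE)

dta <- read.csv(tmp)
nrow(dta)  # → 3
```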

5 Ways to Convert SAS Data to R

The process of importing SAS datasets into R is fairly simple.  As with most R and SAS tasks, there are multiple approaches to achieving the outcome, and this list is by no means exhaustive.  Below we outline five methods to transfer data between SAS and R.

Option 1: The automatic approach using read.ssd()

By passing the data directory, the file to import, and the path to the SAS executable to the read.ssd() function (from the foreign package), the import can be accomplished automatically through R.

library(foreign)
tbl <- read.ssd("C:/data", "datafile",
          sascmd="C:/Program Files/SAS Institute/SAS/V9/sas.exe")

Option 2: The automatic approach using sas.get()

The Hmisc package has a similar function called sas.get.  This function also uses SAS in the background to help convert data.  We’ve found sas.get to be a bit trickier to use than the other methods, since it requires that the SAS executable be accessible by entering “sas” on the command line.

library(Hmisc)
tbl <- sas.get("C:/data", "datafile")

Option 3: The transport-format approach using read.xport

Similar to transferring any SAS dataset between machines, this approach requires that the SAS datasets be converted to transport format from within SAS.  Next, a simple R command (again from the foreign package) is used to import the SAS files.

library(foreign)
tbl <- read.xport("sasfile.xport")

Option 4: The XLS approach

Since MS Excel files are so common for small data storage, using them as an intermediate file format for data transfer may be a reasonable approach.  Here is how you would import an xls file into R using read.xls() from the gdata package.

library(gdata)
tbl <- read.xls("datafile.xls", sheet=1)

Option 5: The delimited file approach

If all else fails, using comma, pipe, or tab delimited files as intermediaries is always an option.  Below is the code to import a comma-separated file into R.

tbl <- read.csv("C:/data/datafile.csv")
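For the pipe- and tab-delimited cases, base R’s read.table() and read.delim() cover the remaining intermediaries.  The sketch below writes a small pipe-delimited file to a temporary path so that it is self-contained; the path in the comment is hypothetical:

```r
# Create a small pipe-delimited file to read back in
tmp <- tempfile(fileext = ".txt")
writeLines(c("id|score", "1|90", "2|85"), tmp)

# sep = "|" handles pipe-delimited data
tbl_pipe <- read.table(tmp, sep = "|", header = TRUE)

# read.delim() is the tab-delimited convenience wrapper, e.g.:
#   tbl_tab <- read.delim("C:/data/datafile.tsv")   # hypothetical path
nrow(tbl_pipe)  # → 2
```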

Entering Data in SAS vs R

Both SAS and R offer programmatic and GUI-based approaches for entering data.  In SAS, a spreadsheet-like tool may be accessed for data entry by selecting “viewtable” from the Tools menu.  Alternatively, SAS users working through Enterprise Guide may double-click on a table to reveal a spreadsheet for data entry.  In R, the data.entry() function may be run on an object to reveal a similar spreadsheet.  Below are approaches for programmatic data entry, first in R and then in SAS.  It should be noted that the below examples are not exhaustive and that other methods exist for data entry in both languages.


R:

x <- c(1.1, 2.1, 3.1)


SAS:

data x;
input var1;
datalines;
1.1
2.1
3.1
;
run;

CSV File Import – SAS vs R

Below are methods for importing a csv file into both R and SAS.  The csv file to import is named “file.csv” and the data, once imported, is called tbl.  In SAS, there are two main approaches: PROC IMPORT and a DATA step.


tbl <- read.csv("C:/data/file.csv")


/* PROC Import Approach */

proc import datafile="C:\data\file.csv"
    out=tbl
    dbms=csv
    replace;
run;

/* Data Step Approach */

data tbl;
    infile "C:\data\file.csv" dsd firstobs=2;
    input var1 var2 ... varn;
run;