Durga Pokharel

Originally published at iamdurga.github.io

Getting Started with the R Programming Language

Introduction

R is a programming language and software environment for statistical computing and graphics, supported by the R Foundation. Unlike general-purpose programming languages such as Java or C, R was created by statisticians as an interactive environment. Interactivity is the critical characteristic that lets us explore our data in R. R supports a wide range of statistical techniques, including linear and non-linear modeling, classification, and many more, and it can produce the different types of plots required during data analysis. To run R, we will use an IDE (according to Wikipedia, an integrated development environment (IDE) is a software application that provides comprehensive facilities to programmers for software development). The core component required by every R program is base R, which contains only the essential functionality needed to run our code successfully.

History of R

Bell Labs developed the S language in 1976. In 1993, Ross Ihaka and Robert Gentleman created R in New Zealand. R became free software in 1995. R version 1.0.0 was released to the public in 2000. The RStudio IDE was released in 2011.

Drawbacks

  • R is built on S. If we want to build apps, R will probably not be our choice.
  • The objects that we work with must be stored in memory, so working with big data sets can quickly use up the available memory.

Installing and Setting Up R on Windows

Step 1: Downloading the installation files

  • Download R from the official website.
  • Next, we need an IDE; the most popular one is RStudio. We can download it from this link.

After downloading the installation files, install them in the desired location and then open the console.

After the installation is complete, open R; we get a window like the one below.

[Image: the R console window after installation]

Now we can write our R code in the console, or we can do it via RStudio.

I prefer to use Jupyter Notebook for running R because I find it friendlier. A good tutorial is available in Anaconda's documentation.

My First R Program

My first R program assigns a variable.

Assigning Variables and Operators in R

A variable is a container that stores values. An assignment statement sets or resets the value stored in the storage location(s) denoted by a variable name (Wikipedia). The assignment operator is a command that tells the computer to assign a value, for example the text apple, to a variable such as product. We can also assign with the assign() function, as in assign('products', 'apple'). We can assign a variable in R in several ways, as shown below.

Way 1

('apple'-> product)

Way 2

(product = 'apple')

Way 3

assign('products', 'apple')
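
The three forms above, together with the leftward arrow <- used in the rest of this post, all bind a value to a name. Here is a minimal sketch to check that they behave the same way. The variable names p1 to p4 and all_equal are made up for this illustration.

```r
# Four ways to bind the same value to a name.
p1 <- "apple"            # leftward arrow, the most common style
"apple" -> p2            # rightward arrow
p3 = "apple"             # equals sign
assign("p4", "apple")    # assign() takes the variable name as a string

# Check that all four variables now hold the same value.
all_equal <- identical(p1, p2) && identical(p2, p3) && identical(p3, p4)
print(all_equal)
```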

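Logical Operators in R

Logical operators are operators that return a TRUE or FALSE value. The examples below use the comparison operators >, <, ==, and !=, each of which returns a logical value. For example: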

Example 1

apple <- 2
banana <- 3
most_expensive <- banana> apple
most_expensive

The output of the above code is:

TRUE

Example 2

apple <- 2
banana <- 3
most_expensive <- banana< apple
most_expensive

The output of the above code is:

FALSE

Example 3

apple <- 2
banana <- 2
most_expensive <- banana == apple
most_expensive

The output is:

TRUE

Example 4

apple <- 2
banana <- 2
most_expensive <- banana != apple
most_expensive

The output is:

FALSE
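
The examples above compare two values. The resulting TRUE and FALSE values can then be combined with the operators & (and), | (or), and ! (not). Here is a small sketch that reuses the fruit prices. The thresholds and the names both_cheap, any_pricey, and not_equal are assumptions made for illustration.

```r
apple <- 2
banana <- 3

both_cheap <- apple < 5 & banana < 5    # TRUE only if both comparisons are TRUE
any_pricey <- apple > 2 | banana > 2    # TRUE if at least one comparison is TRUE
not_equal  <- !(apple == banana)        # ! flips TRUE to FALSE and FALSE to TRUE

print(c(both_cheap, any_pricey, not_equal))
```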

Some Commonly Used Data Types in R

Data is central to analysis: if there is no data, there is no analysis. Every piece of data we work with has certain characteristics, and these characteristics can be summarized by its data type.

  • Character: Anything inside quotation marks is a character.
  • Numeric: By default, numbers in R are stored as double, which can hold both whole numbers and fractions. The other numeric type is integer.
  • Integer: Integer is a simpler version of double that stores only whole numbers. To create one, we must put the capital letter L after the number (for example, 2L). In everyday use we usually work with double rather than integer.
  • Logical (Boolean): TRUE or FALSE, which can also be written as T or F.
  • Complex: Numbers with an imaginary part, such as 2 + 6i.
  • Raw: A rarely used data type that stores raw bytes. It is not easy to create a variable of raw type directly; if we really need one, we call a function such as raw() or as.raw(), which returns raw-type data.

All of these fundamental data types are called atomic data types.

Example of numbers

An integer:

a <- 2L
class(a)

The output is:

'integer'

A numeric:

a <- 2
class(a)

The output is:

'numeric'


quantity <- 2
typeof(quantity)

The output is:

'double'


quantity_integer <- 2L
typeof(quantity_integer)

The output is:

'integer'
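
The same check works for the other atomic types. The sketch below calls typeof() on one value of each remaining kind. The variable name types and the sample values are made up for illustration.

```r
# typeof() reports the storage type of each value.
types <- c(
  typeof("apple"),      # text inside quotation marks
  typeof(TRUE),         # a logical value
  typeof(2 + 6i),       # a complex number
  typeof(as.raw(255))   # a single raw byte
)
print(types)
```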

Comments

Comments give important information about the code. They are not run by the program; a programmer writes them to explain the code better.

# This is a comment in R

Exploring vectors and factors

Data structures, as the name suggests, are ways of organizing data that make different operations easier and calculations faster.

  • Vectors: Collections of data of the same type.
  • Factors: Used to store categorical data.
  • Arrays: Generalizations of vectors to more dimensions; a matrix is the two-dimensional case.
  • Lists / Data frames: Lists are a more complex data structure because they can hold elements of different types, including other lists. A data frame is a list of columns of equal length. We can think of a data frame as a spreadsheet where data is organized in rows and columns and each column has a specific data type. A data frame as a whole can hold all kinds of data types, but a single column holds only one.

Another way to categorize data structures is by their number of dimensions. Vectors and lists are one-dimensional objects. Matrices and data frames are two-dimensional data structures. Arrays are objects that can have more than two dimensions.

Vectors have two properties: they are one-dimensional and contain elements of the same type.
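
To make these structures concrete, the sketch below builds one small object of each kind and inspects its dimensions. All names and values here are made up for illustration.

```r
v   <- c(1, 2, 3)                            # vector: one type, one dimension
f   <- factor(c("low", "high", "low"))       # factor: categorical data
m   <- matrix(1:6, nrow = 2)                 # matrix: 2 rows and 3 columns
arr <- array(1:24, dim = c(2, 3, 4))         # array: three dimensions
l   <- list(1, "a", TRUE, list(2, 3))        # list: mixed types, even another list
df  <- data.frame(name = c("Ram", "Hari"),   # data frame: one type per column
                  age  = c(30, 25))

length(v)     # number of elements in the vector
levels(f)     # the categories stored in the factor
dim(m)        # rows and columns of the matrix
dim(arr)      # size of each of the three dimensions
dim(df)       # rows and columns of the data frame
```

A plain vector has no dim attribute, so dim(v) returns NULL, which is one way to tell it apart from a matrix.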

Assigning a column vector

Let's assign a vector using assign() and c():

assign('b',c(1,2,3,4))
print(b)

The output is:

1 2 3 4

Vector attributes:

  • Length: length(a) gives the number of elements.
  • Names: names(a) gets or sets the names of the elements.
  • Type: typeof(a) gives the type of the data.

There are six vector types:

  • Double
  • Logical
  • Character
  • Complex
  • Raw
  • Integer

vector <- c("Durga","Puja","Ram","Hari")
vector
length(vector)            # length: number of elements
names(vector) = "Sita"    # names: only the first element gets a name
typeof(vector)            # type
vector

The output is:

'Durga' 'Puja' 'Ram' 'Hari'
4
'character'
Sita 'Durga' 2 'Puja' 3 'Ram' 4 'Hari'
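
In the output above only the first element received a name, because a single name was supplied. The sketch below names every element and then selects one by name. The variable name people and the names first to fourth are invented for illustration.

```r
people <- c("Durga", "Puja", "Ram", "Hari")

# Supply one name per element.
names(people) <- c("first", "second", "third", "fourth")

third_person <- people["third"]   # select an element by its name
print(third_person)
print(names(people))              # list all the names
```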

Manipulating Vectors

Manipulating vectors consists of sorting, ordering, and indexing.

  • Sorting: Arranging the data in some order.
  • Ordering: The order() function returns the indices needed to sort the vector.
  • Indexing: Selecting specific items by position.

quantity <- c(1,3,2,5,6,7)
sort(quantity)
order(quantity)

The output is:

1 2 3 5 6 7
1 3 2 4 5 6
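
The relationship between sort() and order() can be checked directly: indexing a vector by its own order() gives the sorted vector. Here is a brief sketch; the names idx and sorted_by_index are made up for illustration.

```r
quantity <- c(1, 3, 2, 5, 6, 7)

idx <- order(quantity)               # positions that would sort the vector
sorted_by_index <- quantity[idx]     # same result as sort(quantity)

print(identical(sorted_by_index, sort(quantity)))
print(sort(quantity, decreasing = TRUE))   # largest value first
```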


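a <- c(1,7,36,0,7,5)
a[2]               # the second element
a[3:5]             # elements 3 through 5
a[c(2,4)]          # elements 2 and 4
a[c(4,7)]          # position 7 does not exist, so it returns NA
a[-2]              # everything except element 2
a[-(2:4)]          # skips elements 2 through 4
a[a == 1]          # elements equal to 1
a[a > 3]           # elements greater than 3
a[a %in% c(2,4)]   # elements that match 2 or 4 (none here)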

The output is:

7
36 0 7
7 0
0 <NA>
1 36 0 7 5
1 7 5
1
7 36 7 5

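Operating on Vectors

When we add or multiply vectors of different lengths, R applies the recycling rule: the shorter vector is repeated until it matches the length of the longer one. For recycling to work cleanly, the length of the longer vector must be a multiple of the length of the shorter one; otherwise R still computes a result but issues a warning.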

c <- 1:6
d <- 1:3
c * d

The output is:

1 4 9 4 10 18
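
When the longer length is not a multiple of the shorter one, R still recycles but raises a warning. The sketch below captures that warning so it can be inspected. The variable names are made up for illustration.

```r
long_vec  <- 1:5
short_vec <- 1:2

# tryCatch() intercepts the warning raised by the mismatched lengths
# and returns its message instead of the product.
msg <- tryCatch(
  long_vec * short_vec,
  warning = function(w) conditionMessage(w)
)
print(msg)

# suppressWarnings() lets the recycled result through:
# 1*1, 2*2, 3*1, 4*2, 5*1
result <- suppressWarnings(long_vec * short_vec)
print(result)
```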

Sequence generation

The seq() function creates a sequence of elements in a vector. It takes the length of the sequence (length.out) and the difference between values (by) as optional arguments. In the code below, I take elements in the range 1 to 5 at intervals of 1.5.

Example:

seq(1,5,by = 1.5)

The output is:

1 2.5 4
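
The by and length.out arguments answer different questions: by fixes the step, while length.out fixes how many values come back. Here is a short sketch with made-up variable names.

```r
by_step   <- seq(1, 5, by = 2)           # fixed step of 2: 1 3 5
by_length <- seq(0, 1, length.out = 5)   # five evenly spaced values from 0 to 1
countdown <- seq(10, 1, by = -3)         # a negative step counts down: 10 7 4 1

print(by_step)
print(by_length)
print(countdown)
```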

Replicating elements

The built-in rep() function repeats the elements of a vector a specified number of times. In the following code, I replicate the numbers from 1 to 6 two times.

Example:

e<- rep(1:6,times = 2)
e

The output is:

1 2 3 4 5 6 1 2 3 4 5 6

We can also repeat a single number as many times as we want:

x <- rep(c(1),each = 10)
x

The output is:

1 1 1 1 1 1 1 1 1 1
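
The difference between times and each only shows up when the vector has more than one element. Here is a small sketch with made-up variable names.

```r
whole_twice <- rep(1:3, times = 2)                  # whole vector repeated: 1 2 3 1 2 3
each_twice  <- rep(1:3, each = 2)                   # each element repeated: 1 1 2 2 3 3
per_element <- rep(c("a", "b"), times = c(2, 3))    # one count per element: a a b b b

print(whole_twice)
print(each_twice)
print(per_element)
```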

Scan Function

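The scan() function reads the contents of a file into a vector. It is a very powerful function. In the code given below, scan() reads the file covid data.csv. The what argument tells scan() to read the values as character strings, and because the default separator is whitespace, each line of the file becomes one element of the vector.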

f <- scan("covid data.csv", what = "Character")
f

The output of the above code is:

'date,totalCases,newCases,totalRecoveries,newRecoveries,totalDeaths,newDeaths' '1/23/2020,1,1,0,0,0,0' '1/24/2020,0,0,0,0,0,0' '1/25/2020,0,0,0,0,0,0' '1/26/2020,0,0,0,0,0,0' '1/27/2020,0,0,0,0,0,0' '1/28/2020,0,0,0,0,0,0' '1/29/2020,0,0,0,0,0,0' '1/30/2020,0,0,0,0,0,0' '1/31/2020,0,0,1,1,0,0' '2/1/2020,0,0,1,0,0,0' '2/2/2020,0,0,1,0,0,0' '2/3/2020,0,0,1,0,0,0' '2/4/2020,0,0,1,0,0,0' '2/5/2020,0,0,1,0,0,0' '2/6/2020,0,0,1,0,0,0' '2/7/2020,0,0,1,0,0,0' '2/8/2020,0,0,1,0,0,0' '2/9/2020,0,0,1,0,0,0' '2/10/2020,0,0,1,0,0,0' '2/11/2020,0,0,1,0,0,0' '2/12/2020,0,0,1,0,0,0' '2/13/2020,0,0,1,0,0,0' '2/14/2020,0,0,1,0,0,0' '2/15/2020,0,0,1,0,0,0' '2/16/2020,0,0,1,0,0,0' '2/17/2020,0,0,1,0,0,0' '2/18/2020,0,0,1,0,0,0' '2/19/2020,0,0,1,0,0,0' '2/20/2020,0,0,2,1,0,0' '2/21/2020,0,0,2,0,0,0' '2/22/2020,0,0,2,0,0,0' '2/23/2020,0,0,2,0,0,0' '2/24/2020,0,0,2,0,0,0' '2/25/2020,0,0,2,0,0,0' '2/26/2020,0,0,2,0,0,0' '2/27/2020,0,0,2,0,0,0' '2/28/2020,0,0,2,0,0,0' '2/29/2020,0,0,2,0,0,0' '3/1/2020,0,0,2,0,0,0' '3/2/2020,0,0,2,0,0,0' '3/3/2020,0,0,2,0,0,0' '3/4/2020,0,0,2,0,0,0' '3/5/2020,0,0,2,0,0,0' '3/6/2020,0,0,2,0,0,0' '3/7/2020,0,0,2,0,0,0' '3/8/2020,0,0,2,0,0,0' '3/9/2020,0,0,2,0,0,0' '3/10/2020,0,0,2,0,0,0' '3/11/2020,0,0,2,0,0,0' '3/12/2020,0,0,2,0,0,0' '3/13/2020,0,0,2,0,0,0' '3/14/2020,0,0,2,0,0,0' '3/15/2020,0,0,2,0,0,0' '3/16/2020,0,0,2,0,0,0' '3/17/2020,0,0,2,0,0,0' '3/18/2020,0,0,2,0,0,0' '3/19/2020,0,0,2,0,0,0' '3/20/2020,0,0,2,0,0,0' '3/21/2020,0,0,2,0,0,0' '3/22/2020,0,0,2,0,0,0' '3/23/2020,1,1,2,0,0,0' '3/24/2020,1,0,2,0,0,0' '3/25/2020,2,1,2,0,0,0' '3/26/2020,2,0,2,0,0,0' '3/27/2020,3,1,2,0,0,0' '3/28/2020,4,1,2,0,0,0' '3/29/2020,4,0,2,0,0,0' '3/30/2020,4,0,2,0,0,0' '3/31/2020,4,0,2,0,0,0' '4/1/2020,4,0,2,0,0,0' '4/2/2020,5,1,2,0,0,0' '4/3/2020,5,0,2,0,0,0' '4/4/2020,8,3,2,0,0,0' '4/5/2020,8,0,2,0,0,0' '4/6/2020,8,0,2,0,0,0' '4/7/2020,8,0,2,0,0,0' '4/8/2020,8,0,2,0,0,0' '4/9/2020,8,0,2,0,0,0' '4/10/2020,8,0,2,0,0,0' '4/11/2020,8,0,2,0,0,0' '4/12/2020,11,3,2,0,0,0' 
'4/13/2020,13,2,2,0,0,0' '4/14/2020,15,2,2,0,0,0' '4/15/2020,15,0,2,0,0,0' '4/16/2020,15,0,2,0,0,0' '4/17/2020,29,14,2,0,0,0' '4/18/2020,30,1,4,2,0,0' '4/19/2020,30,0,5,1,0,0' '4/20/2020,30,0,5,0,0,0' '4/21/2020,41,11,6,1,0,0' '4/22/2020,44,3,8,2,0,0' '4/23/2020,47,3,9,1,0,0' '4/24/2020,48,1,11,2,0,0' '4/25/2020,48,0,12,1,0,0' '4/26/2020,51,3,14,2,0,0' '4/27/2020,51,0,14,0,0,0' '4/28/2020,53,2,14,0,0,0' '4/29/2020,56,3,14,0,0,0' '4/30/2020,56,0,14,0,0,0' '5/1/2020,58,2,14,0,0,0' '5/2/2020,58,0,14,0,0,0' '5/3/2020,74,16,14,0,0,0' '5/4/2020,74,0,14,0,0,0' '5/5/2020,81,7,14,0,0,0' '5/6/2020,98,17,20,6,0,0' '5/7/2020,100,2,20,0,0,0' '5/8/2020,101,1,28,8,0,0' '5/9/2020,108,7,29,1,0,0' '5/10/2020,109,1,29,0,0,0' '5/11/2020,133,24,31,2,0,0' '5/12/2020,216,83,31,0,0,0' '5/13/2020,242,26,33,2,0,0' '5/14/2020,248,6,33,0,1,1' '5/15/2020,266,18,34,1,1,0' '5/16/2020,280,14,34,0,1,0' '5/17/2020,294,14,34,0,3,2' '5/18/2020,374,80,34,0,3,0' '5/19/2020,401,27,35,1,3,0' '5/20/2020,426,25,43,8,3,0' '5/21/2020,456,30,47,4,4,1' '5/22/2020,515,59,68,21,4,0' '5/23/2020,583,68,68,0,5,1' '5/24/2020,602,19,85,17,5,0' '5/25/2020,681,79,110,25,5,0' '5/26/2020,771,90,152,42,5,0' '5/27/2020,885,114,180,28,6,1' '5/28/2020,1041,156,184,4,6,0' '5/29/2020,1211,170,184,0,6,0' '5/30/2020,1400,189,188,4,7,1' '5/31/2020,1571,171,189,1,8,1' '6/1/2020,1810,239,190,1,8,0' '6/2/2020,2098,288,235,45,9,1' '6/3/2020,2299,201,238,3,11,2' '6/4/2020,2633,334,256,18,12,1' '6/5/2020,2911,278,289,33,12,0' '6/6/2020,3234,323,295,6,13,1' '6/7/2020,3447,213,340,45,13,0' '6/8/2020,3760,313,363,23,14,1' '6/9/2020,4083,323,394,31,15,1' '6/10/2020,4362,279,394,0,17,2' '6/11/2020,4612,250,394,0,17,0' '6/12/2020,5059,447,394,0,18,1' '6/13/2020,5334,275,394,0,19,1' '6/14/2020,5759,425,394,0,19,0' '6/15/2020,6210,451,1044,650,20,1' '6/16/2020,6590,380,1161,117,20,0' '6/17/2020,7176,586,1170,9,22,2' '6/18/2020,7847,671,1189,19,22,0' '6/19/2020,8273,426,1405,216,23,1' '6/20/2020,8604,331,1581,176,23,0' 
'6/21/2020,9025,421,1775,194,23,0' '6/22/2020,9558,533,2151,376,24,1' '6/23/2020,10098,540,2225,74,24,0' '6/24/2020,10727,629,2339,114,25,1' '6/25/2020,11161,434,2651,312,27,2' '6/26/2020,11754,593,2699,48,27,0' '6/27/2020,12308,554,2835,136,29,2' '6/28/2020,12771,463,3014,179,30,1' '6/29/2020,13247,476,3135,121,30,0' '6/30/2020,13563,316,3195,60,30,0' '7/1/2020,14045,482,4657,1462,33,3' '7/2/2020,14518,473,5321,664,33,0' '7/3/2020,15258,740,6144,823,33,0' '7/4/2020,15490,232,6416,272,34,1' '7/5/2020,15783,293,6548,132,35,1' '7/6/2020,15963,180,6812,264,35,0' '7/7/2020,16167,204,7500,688,36,1' '7/8/2020,16422,255,7753,253,36,0' '7/9/2020,16530,108,7892,139,38,2' '7/10/2020,16648,118,8012,120,39,1' '7/11/2020,16718,70,8443,431,39,0' '7/12/2020,16800,82,8590,147,39,0' '7/13/2020,16944,144,10295,1705,39,0' '7/14/2020,17060,116,10329,34,39,0' '7/15/2020,17176,116,11026,697,40,1' '7/16/2020,17343,167,11250,224,41,1' '7/17/2020,17444,101,11388,138,41,0' '7/18/2020,17501,57,11491,103,41,0' '7/19/2020,17657,156,11549,58,41,0' '7/20/2020,17843,186,11722,173,41,0' '7/21/2020,17993,150,12331,609,42,1' '7/22/2020,18093,100,12538,207,44,2' '7/23/2020,18240,147,12694,156,44,0' '7/24/2020,18373,133,12801,107,47,3' '7/25/2020,18482,109,12907,106,47,0' '7/26/2020,18612,130,12982,75,49,2' '7/27/2020,18751,139,13608,626,50,1' '7/28/2020,19062,311,13729,121,50,0' '7/29/2020,19272,210,13875,146,53,3' '7/30/2020,19546,274,14102,227,56,3' '7/31/2020,19770,224,14253,151,57,1' '8/1/2020,20085,315,14346,93,59,2' '8/2/2020,20331,246,14457,111,59,0' '8/3/2020,20749,418,14815,358,61,2' '8/4/2020,21008,259,14880,65,62,1' '8/5/2020,21389,381,15010,130,67,5' '8/6/2020,21749,360,15243,233,71,4' '8/7/2020,22213,464,15668,425,74,3' '8/8/2020,22591,378,16167,499,76,2' '8/9/2020,22971,380,16207,40,80,4' '8/10/2020,23309,338,16347,140,83,3' '8/11/2020,23947,638,16518,171,86,3' '8/12/2020,24431,484,16582,64,95,9' '8/13/2020,24956,525,16691,109,96,1' '8/14/2020,25550,594,16931,240,101,5' 
'8/15/2020,26018,468,17055,124,102,1' '8/16/2020,26659,641,17189,134,104,2' '8/17/2020,27240,581,17349,160,107,3' '8/18/2020,28256,1016,17434,85,114,7' '8/19/2020,28937,681,17554,120,120,6' '8/20/2020,29644,707,17818,264,126,6' '8/21/2020,30482,838,18068,250,137,11' '8/22/2020,31116,634,18204,136,146,9' '8/23/2020,31934,818,18485,281,149,3' '8/24/2020,32677,743,18660,175,157,8' '8/25/2020,33532,855,18973,313,164,7' '8/26/2020,34417,885,19358,385,175,11' '8/27/2020,35528,1111,19927,569,183,8' '8/28/2020,36455,927,20096,169,195,12' '8/29/2020,37339,884,20409,313,207,12' '8/30/2020,38560,1221,20676,267,221,14' '8/31/2020,39459,899,21264,588,228,7' '9/1/2020,40528,1069,22032,768,239,11' '9/2/2020,41648,1120,23144,1112,250,11' '9/3/2020,42876,1228,24061,917,257,7' '9/4/2020,44235,1359,25415,1354,271,14' '9/5/2020,45276,1041,26981,1566,280,9' '9/6/2020,46256,980,28795,1814,289,9' '9/7/2020,47235,979,30531,1736,300,11' '9/8/2020,48137,902,32818,2287,306,6' '9/9/2020,49218,1081,33736,918,312,6' '9/10/2020,50464,1246,35554,1818,317,5' '9/11/2020,51918,1454,36526,972,322,5' '9/12/2020,53119,1201,37378,852,336,14' '9/13/2020,54158,1039,38551,1173,345,9' '9/14/2020,55328,1170,39430,879,360,15' '9/15/2020,56787,1459,40492,1062,371,11' '9/16/2020,58326,1539,41560,1068,379,8' '9/17/2020,59572,1246,42803,1243,383,4' '9/18/2020,61592,2020,43674,871,390,7' '9/19/2020,62796,1204,45121,1447,401,11' '9/20/2020,64121,1325,46087,966,411,10' '9/21/2020,65275,1154,47092,1005,427,16' '9/22/2020,66631,1356,47915,823,429,2' '9/23/2020,67803,1172,49808,1893,436,7' '9/24/2020,69300,1497,50265,457,452,16' '9/25/2020,70613,1313,51720,1455,458,6' '9/26/2020,71820,1207,52867,1147,466,8' '9/27/2020,73393,1573,53752,885,476,10' '9/28/2020,74744,1351,54494,742,481,5' '9/29/2020,76257,1513,55225,731,491,10' '9/30/2020,77816,1559,56282,1057,498,7'
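
Each element of the vector above is a whole line of the file. For tabular data like this, read.csv() is usually more convenient, because it returns a data frame with one column per field. The sketch below writes a few of the rows shown above to a temporary file so that it is self-contained. The temporary file is an assumption standing in for the real CSV file, and the variable names are made up.

```r
# Write a small sample in the same format to a temporary file.
sample_lines <- c(
  "date,totalCases,newCases,totalRecoveries,newRecoveries,totalDeaths,newDeaths",
  "1/23/2020,1,1,0,0,0,0",
  "1/24/2020,0,0,0,0,0,0",
  "1/31/2020,0,0,1,1,0,0"
)
path <- tempfile(fileext = ".csv")
writeLines(sample_lines, path)

# scan() returns one character string per whitespace-separated token,
# which here means one string per line.
as_vector <- scan(path, what = "character", quiet = TRUE)
print(length(as_vector))

# read.csv() parses the header row and the comma-separated fields.
covid <- read.csv(path)
print(names(covid))
print(covid$totalRecoveries)
```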

Implicit Type Coercion

When a vector mixes different types of data, R automatically converts the elements to a single common type. This is called implicit coercion. In the example below, the numbers are coerced to character.

x <- c(1,'two',4,"durga")
x
typeof(x)

The output is:

'1' 'two' '4' 'durga'
'character'
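
Character is not the only target of implicit coercion. R picks the most flexible type present, so logical values become numbers when mixed with numbers. Here is a short sketch with made-up variable names.

```r
mixed_lgl_num <- c(TRUE, 2L, 3.5)            # logical and integer become double
mixed_num_chr <- c(1.5, "a")                 # anything mixed with text becomes character
total_true    <- sum(c(TRUE, FALSE, TRUE))   # TRUE counts as 1 and FALSE as 0

print(typeof(mixed_lgl_num))
print(typeof(mixed_num_chr))
print(total_true)
```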

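Explicit Type Coercion

  • We do this by calling a function named as. followed by the desired data type, such as as.character() or as.numeric(). Explicit type coercion helps us deal with incorrectly categorized data.
  • We can convert numeric into character.
  • We can convert character into numeric, but values that do not look like numbers become NA.

num <- 1:5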
num_char <- as.character(num)
num_char

The output is:

'1' '2' '3' '4' '5'


product <- c("apple",1,"banana")
as.numeric(product)

The output is:

Warning message in eval(expr, envir, enclos):
"NAs introduced by coercion"
<NA> 1 <NA>
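
A few more conversions, as a sketch, show when explicit coercion succeeds and when it yields NA. The variable names are made up, and suppressWarnings() is used only to hide the warning that is expected here.

```r
clean_numbers <- as.numeric(c("1", "2.5"))    # text that looks like numbers converts cleanly
whole_part    <- as.integer(3.9)              # the fraction is dropped, leaving 3
flags         <- as.logical(c("TRUE", "no"))  # unrecognized text becomes NA

product   <- c("apple", 1, "banana")
converted <- suppressWarnings(as.numeric(product))   # hide the expected warning
failed    <- is.na(converted)                        # marks the values that could not convert

print(clean_numbers)
print(whole_part)
print(flags)
print(failed)
```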

Installing Packages in R

There are numerous useful packages for various tasks in R, and with them we can do things better and faster. One simple way to install a package is via the console:

install.packages("haven")


library("haven") # allows to read sav file
saq8 <- read_sav("F:/Statisticts with R/CSV file for covid data/SAQ8.sav")

In the above example, I first installed the package named haven and then used it to read a .sav file.
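
Calling install.packages() every time a script runs is slow, so a common pattern is to install a package only when it is missing. The sketch below shows that pattern. The helper load_or_install is a hypothetical name, not part of R, and it is demonstrated on the stats package, which ships with R, so nothing is downloaded here.

```r
# Hypothetical helper: install a package only if it is not already available,
# then attach it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets us pass the package name as a string.
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

ok <- load_or_install("stats")
print(ok)
```

For the example in this post, the call would be load_or_install("haven").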

That is all for this blog, and I hope you enjoyed it. Please leave your feedback and stay tuned for my next blog.
