Data Structure in R programming

maxwizard01 ・ 7 min read

🔥 Data Structure

🔥 Introduction to R (1)

🔥 Data Structure (2)

🔥 Statistical value (mean, median, mode etc) (3)

🔥 Tabular Presentation of Data (4)

🔥 Plotting graph with R (5)

🔥 Constructing frequency distribution with R (6)

Data Structures

R also has a number of basic data structures. A data structure is either homogeneous (all elements are of the same data type) or heterogeneous (elements can be of more than one data type).
Dimension   Homogeneous   Heterogeneous
1           Vector        List
2           Matrix        Data Frame
3+          Array
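
To make the table concrete, here is a minimal sketch that builds one small object of each kind. The variable names and values are made up for illustration and do not come from the article; list(), matrix(), data.frame() and array() are base R functions.

```r
# One small object per cell of the table (all names and values are illustrative).
v   = c(1.5, 2.5, 3.5)                          # vector: 1 dimension, one type
l   = list(1.5, "a", TRUE)                      # list: 1 dimension, mixed types
m   = matrix(1:6, nrow = 2, ncol = 3)           # matrix: 2 dimensions, one type
df  = data.frame(id = 1:2, name = c("a", "b"))  # data frame: 2 dimensions, one type per column
arr = array(1:24, dim = c(2, 3, 4))             # array: 3 dimensions, one type

print(length(v))   # number of elements in the vector
print(dim(m))      # rows and columns of the matrix
print(dim(df))     # rows and columns of the data frame
print(dim(arr))    # size of each of the three dimensions
```

The rest of this article deals only with the first of these, the vector.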

Vectors

Many operations in R make heavy use of vectors. Vectors in R are indexed starting at 1. That is what the [1] at the start of an output line indicates: the first element displayed on that row is the first element of the vector.
Larger vectors start additional rows with [*], where * is the index of the first element of that row.

The most common way to create a vector in R is the c() function, which is short for “combine.” As the name suggests, it combines a list of elements separated by commas.

c(1, 3, 5, 7, 8, 9)
[1] 1 3 5 7 8 9

Here R simply outputs this vector. To store the vector in a variable, use the assignment operator = (R also accepts <-). For example, after x = c(1, 3, 5, 7, 8, 9) the variable x holds the vector we just created, and we can access the vector by typing x.
Let's quickly take a look at some examples.

Example 1

Let's store the ages of six people in a vector with the following code:

Age = c(11, 31, 25, 7, 8, 19)
print(Age)

RESULT

> Age = c(11, 31, 25, 7, 8, 19)
> print(Age)
[1] 11 31 25  7  8 19

Example 2

Suppose we only need to print one of the values in the vector. We can print it by giving its position, using the form
variableName[position]
Copy and study the following code:

Age = c(11, 31, 25, 7, 8, 19)
print(Age[1])
print(Age[3])
print(Age[6])

Result

> Age = c(11, 31, 25, 7, 8, 19)
> print(Age[1])
[1] 11
> print(Age[3])
[1] 25
> print(Age[6])
[1] 19

Can you see the result? Age[1] is the first value in the Age vector, while Age[3] is the third value.
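
The square brackets are not limited to a single position. As a small sketch going slightly beyond the example above, the index can itself be a vector of positions, and a negative position leaves that element out. The variable names firstAndThird, firstThree and withoutFirst are illustrative only.

```r
Age = c(11, 31, 25, 7, 8, 19)

firstAndThird = Age[c(1, 3)]  # positions 1 and 3
firstThree    = Age[1:3]      # positions 1, 2 and 3
withoutFirst  = Age[-1]       # everything except position 1

print(firstAndThird)  # 11 25
print(firstThree)     # 11 31 25
print(withoutFirst)   # 31 25 7 8 19
```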

You may ask: can a vector hold only numbers? The answer is no.
You can store strings (characters) too; just make sure you put them in quotes.

Example 3

maxwizardBio=c("Oladejo Abdullahi",18,221382,"Education","UI","maths")
print(maxwizardBio[4])
print(maxwizardBio[1])
print(maxwizardBio[3])

RESULT

> maxwizardBio=c("Oladejo Abdullahi",18,221382,"Education","UI","maths")
> print(maxwizardBio[4])
[1] "Education"
> print(maxwizardBio[1])
[1] "Oladejo Abdullahi"
> print(maxwizardBio[3])
[1] "221382"
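
Notice that 221382 is printed in quotes in the result. A vector is homogeneous, so when numbers and strings are combined with c(), R converts the numbers to strings. The sketch below checks this with class() and converts one value back with as.numeric(); the names mixed and matricNumber are made up for illustration.

```r
mixed = c("Oladejo Abdullahi", 18, 221382)
print(class(mixed))           # "character": the numbers were turned into strings
print(class(c(18, 221382)))   # "numeric": a vector of numbers only stays numeric

# Convert a stored string back to a number before doing arithmetic with it.
matricNumber = as.numeric(mixed[3])
print(matricNumber + 1)       # 221383
```

If the values need to keep their own types, a list (the heterogeneous one-dimensional structure in the table) is the usual choice.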

As the results show, a vector can store strings too. Now let's see how to store a sequence inside a vector.

Example

Imagine we want number to store a vector of the numbers from 1 to 200. It would be tedious to type them all and separate them with commas, so instead we type 1:200.
Try the following code:

 number=c(1:200)
 print(number)

Result

> number=c(1:200)
> print(number)
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
 [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
 [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
 [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
 [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
 [91]  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107 108
[109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
[127] 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
[145] 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162
[163] 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
[181] 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198
[199] 199 200

Interesting! What if we need to print all the even numbers from 1 to 60?
Let's see how to write the code:

evnNumber=seq(from = 2, to = 60, by = 2)
print(evnNumber)

Result

> evnNumber=seq(from = 2, to = 60, by = 2)
> print(evnNumber)
 [1]  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50
[26] 52 54 56 58 60

Can you see how this works? We wanted a sequence of numbers that no longer increases by one,
so we created it with the seq() function, giving the start (from), the end (to), and the step (by).
Note: we start from 2 instead of 1 because 2 is the first even number.
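
When you know how many values you want rather than where the sequence should stop, seq() also accepts a length.out argument in place of to. The sketch below rebuilds the same even numbers that way; the name evens is illustrative.

```r
# 30 even numbers starting at 2, stepping by 2 (no end point given).
evens = seq(from = 2, by = 2, length.out = 30)

print(evens[30])       # 60, the last of the 30 values
print(length(evens))   # 30

# Same values as the from/to/by version used above.
print(all(evens == seq(from = 2, to = 60, by = 2)))  # TRUE
```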

Let's try another example.

Example

Write a program that prints all the odd numbers from 1 to 100.
Copy the following code:

oddNumber=seq(from = 1, to = 100, by = 2)
print(oddNumber)

RESULT

> oddNumber=seq(from = 1, to = 100, by = 2)
> print(oddNumber)
 [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49
[26] 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99

Now let's use this to solve arithmetic progression (A.P.) problems.

Example

If the first term of an A.P. is 4 and the common difference is 14, find:
i) the 100th term;
ii) the first 100 terms of the sequence;
iii) hence print
a) the 34th term b) the 40th term c) the 65th term.

Solution

We know the formula for the nth term:
Tn = a + (n - 1)d
Now let's start with the following code:

a=4
d=14
n=100
T100=a+(n-1)*d
print(T100)
first100Term=seq(from=4,to=T100,by=d)
print(first100Term)
print(first100Term[34])
print(first100Term[40])
print(first100Term[65])

Result

> a=4
> d=14
> n=100
> T100=a+(n-1)*d
> print(T100)
[1] 1390
> first100Term=seq(from=4,to=T100,by=d)
> print(first100Term)
  [1]    4   18   32   46   60   74   88  102  116  130  144  158  172  186  200
 [16]  214  228  242  256  270  284  298  312  326  340  354  368  382  396  410
 [31]  424  438  452  466  480  494  508  522  536  550  564  578  592  606  620
 [46]  634  648  662  676  690  704  718  732  746  760  774  788  802  816  830
 [61]  844  858  872  886  900  914  928  942  956  970  984  998 1012 1026 1040
 [76] 1054 1068 1082 1096 1110 1124 1138 1152 1166 1180 1194 1208 1222 1236 1250
 [91] 1264 1278 1292 1306 1320 1334 1348 1362 1376 1390
> print(first100Term[34])
[1] 466
> print(first100Term[40])
[1] 550
> print(first100Term[65])
[1] 900
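
The same terms can also be produced straight from the formula Tn = a + (n - 1)d, because arithmetic in R works on a whole vector of n values at once. This is a minimal sketch of that alternative, not the method used above; the names termNumber and terms are illustrative.

```r
a = 4    # first term
d = 14   # common difference

termNumber = 1:100                  # n = 1, 2, ..., 100
terms = a + (termNumber - 1) * d    # Tn for every n in one step

print(terms[100])                   # 1390, the 100th term
print(terms[c(34, 40, 65)])         # 466 550 900

# Agrees with the seq() version built from the first term and the step.
print(all(terms == seq(from = a, by = d, length.out = 100)))  # TRUE
```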

Incredible! We just solved an A.P. problem with code.
Let's quickly see another way to create a vector.
Another common function for creating a vector is rep(), which can repeat a single value a number of times.
Example
Code:

allB=rep("B", times = 10)
print(allB)

Result

> allB=rep("B", times = 10)
> print(allB)
 [1] "B" "B" "B" "B" "B" "B" "B" "B" "B" "B"

The rep() function can also be used to repeat a whole vector some number of times.
Example
Type the following code:

x=c(1,5,2,9)
y=rep(x, times = 3)
print(y)

Result

> x=c(1,5,2,9)
> y=rep(x, times = 3)
> print(y)
 [1] 1 5 2 9 1 5 2 9 1 5 2 9

Study the code and its results. "B" was repeated the number of times we provided. Likewise, x was already a vector, and repeating it three times produced a longer vector.
Note: naming the times argument is optional; you can leave the name out and write the call as follows.
Code

x=c(1,5,2,9)
y=rep(x,3)
print(y)

Run the code and you should get the same result as before. The same applies to the seq() function we used earlier: the argument names from, to, and by can be left out when the values are given in that order.
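
Besides times, rep() also has an each argument, which repeats every element in place instead of repeating the whole vector. A short sketch comparing the two follows; the names repeatedWhole and repeatedEach are illustrative.

```r
x = c(1, 5, 2, 9)

repeatedWhole = rep(x, times = 3)  # the whole vector, three times over
repeatedEach  = rep(x, each = 3)   # each element three times, in order

print(repeatedWhole)  # 1 5 2 9 1 5 2 9 1 5 2 9
print(repeatedEach)   # 1 1 1 5 5 5 2 2 2 9 9 9
```

Both results have 12 elements; only the order differs.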

The length of a vector can be obtained with the length() function.

Example

x=c(3,5,2,4,1,0)
y=1:100
z=seq(3,40,5)
a=rep(x,4)
length(x)
length(y)
length(z)
length(a)

Result

> x=c(3,5,2,4,1,0)
> y=1:100
> z=seq(3,40,5)
> a=rep(x,4)
> length(x)
[1] 6
> length(y)
[1] 100
> length(z)
[1] 8
> length(a)
[1] 24

We have now seen four different ways to create vectors:
• c()
• :
• seq()
• rep()
Use any of the above to create a vector.
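
These approaches can also be mixed, because c() accepts whole vectors as well as single values. Here is a small sketch with made-up values; the name combined is illustrative.

```r
# Join the results of rep(), : and seq() into one vector with c().
combined = c(rep(0, times = 2), 1:3, seq(from = 10, to = 20, by = 5))

print(combined)          # 0 0 1 2 3 10 15 20
print(length(combined))  # 8
```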

Vectorization

One of the biggest strengths of R is its use of vectorized operations: an operation or function applied to a vector is carried out on every element at once, with no explicit loop.

Example

Code:

number = 2:10
print(number + 4)
print(2 * number)
print(2 ^ number)
print(sqrt(number)) # this is used to find the square root
print(log(number)) # to find the logarithm of each.

Result

> number = 2:10
> print(number + 4)
[1]  6  7  8  9 10 11 12 13 14
> print(2 * number)
[1]  4  6  8 10 12 14 16 18 20
> print(2 ^ number)
[1]    4    8   16   32   64  128  256  512 1024
> print(sqrt(number)) # this is used to find the square root
[1] 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427 3.000000 3.162278
> print(log(number)) # to find the logarithm of each.
[1] 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101 2.0794415 2.1972246 2.3025851
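
To see what vectorization saves you, the sketch below computes 2 * number twice: once with an explicit for loop and once with the vectorized form from the example. The names doubledLoop and doubledVec are illustrative.

```r
number = 2:10

# Loop version: visit each position and fill in the result one element at a time.
doubledLoop = numeric(length(number))
for (i in seq_along(number)) {
  doubledLoop[i] = 2 * number[i]
}

# Vectorized version: one expression handles every element.
doubledVec = 2 * number

print(doubledVec)                      # 4 6 8 10 12 14 16 18 20
print(all(doubledLoop == doubledVec))  # TRUE
```

The vectorized line is shorter and is the form you will normally see in R code.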

By now you have learned a good deal about vectors: how to create them and how to use them.
