Some useful bar plots using R

In this article I show how to produce bar plots using R. Many of my friends think SPSS is the most useful software for producing plots and keep using it (some of them even use Excel!).
My goal is to show that R can produce every type of graph that commercial software can. In fact, it does much better than the simple point-and-click packages, because R gives us much finer control over our analysis.

The data I am working with are:

   sex income   district
female     21      dhaka
  male     11      dhaka
  male     43 chittagong
female     22      dhaka
  male     56    barisal
female     23    barisal
female     66      dhaka
  male     76      dhaka
female     11 chittagong
female     89      dhaka

These are not real data; I made them up just to experiment with R code.
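The commands in the rest of the article assume these rows are stored in a data frame called 'data'. How the data were originally read in is not shown, so the following is only a sketch of one way to enter them by hand; the use of data.frame() and of factor columns is an assumption.

```r
# Enter the ten rows by hand; the name `data` matches the commands used later.
# (Assumption: the original loading step is not shown in the article.)
data <- data.frame(
  sex = c("female", "male", "male", "female", "male",
          "female", "female", "male", "female", "female"),
  income = c(21, 11, 43, 22, 56, 23, 66, 76, 11, 89),
  district = c("dhaka", "dhaka", "chittagong", "dhaka", "barisal",
               "barisal", "dhaka", "dhaka", "chittagong", "dhaka"),
  stringsAsFactors = TRUE  # store sex and district as factors
)

# Check the column types and the number of rows.
str(data)
```

Any other way of loading the table (for example from a file) would do equally well, as long as the column names stay 'sex', 'income' and 'district'.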

Now I want to produce a bar plot in which 'sex' is the category axis and the clustered bars show the mean and median 'income', i.e. the plot we produce in SPSS with the command:

graph
/bar=mean(income) median(income) by sex.

So, first I calculate the mean and median 'income' for males and females, bind the two results into one matrix and pass it to barplot.

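# mean and median income by sex
m1<-tapply(data$income,data$sex,mean)
m2<-tapply(data$income,data$sex,median)
# one row per statistic, one column per sex
r<-rbind(m1,m2)
# beside=TRUE draws clustered bars; b holds the bar midpoints
b<-barplot(r,col=c("green","blue"),ylim=c(0,65),beside=TRUE)
legend("topleft",c("mean","median"),col=c("green","blue"),pch=15)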

Then, if I want to print the value each bar represents above it, the code is:

text(x=b,y=r,labels=round(r,2),pos=3,col="black",cex=1.25)

And the plot is: plot1.png

Now suppose I want a more complex plot: a bar plot showing the mean income in each of the three districts, separately for males and females, i.e. the plot we produce in SPSS with the command:

graph
/bar=mean(income) by sex by district.

For the required summary statistics I used the 'Epi' package, whose stat.table function produces a very useful summary table:

library(Epi)
s<-stat.table(list(district,sex),contents=list(mean(income)),data=data)

And the resulting table is:

----------------------------- 
             -------sex------- 
 district      female    male  
 ----------------------------- 
 barisal        23.00   56.00  
 chittagong     11.00   43.00  
 dhaka          49.50   43.50  
 -----------------------------

These statistics can also be obtained with a few lines of base R code instead of 'stat.table'; I used it just to save some code.
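For readers who prefer not to install 'Epi', here is a sketch of the base R route mentioned above. It is not the code used for the plot in this article; it assumes that tapply with two grouping variables is an acceptable substitute, and the name 'tab' is only illustrative.

```r
# The made-up data from the table at the top of the article, re-entered so
# that this block runs on its own.
data <- data.frame(
  sex = c("female", "male", "male", "female", "male",
          "female", "female", "male", "female", "female"),
  income = c(21, 11, 43, 22, 56, 23, 66, 76, 11, 89),
  district = c("dhaka", "dhaka", "chittagong", "dhaka", "barisal",
               "barisal", "dhaka", "dhaka", "chittagong", "dhaka")
)

# Mean income for every district-by-sex combination.
# Rows are districts and columns are sexes, as in the stat.table output.
tab <- tapply(data$income, list(data$district, data$sex), mean)
print(tab)
```

Because 'tab' is already a matrix with districts as rows and sexes as columns, it can be passed to barplot with beside=TRUE without being rebuilt by hand.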
Then I ran the following commands to build the plot from the stat.table result s:

# the first three elements of s are the female means, the last three the male means
female<-c(s[1],s[2],s[3])
male<-c(s[4],s[5],s[6])
r<-cbind(female,male)
rownames(r)<-c('barisal','chittagong','dhaka')
# one cluster per sex, one bar per district
b<-barplot(r,col=c("red","green","yellow"),beside=TRUE,ylim=c(0,60))
legend("topleft",c("barisal","chittagong","dhaka"),col=
c("red","green","yellow"),pch=15,bty="n")
# print each mean above its bar
text(x=b,y=r,labels=r,cex=1.25,pos=3)

And the plot is: hmmm.jpeg

I hope this code will be useful to those who really want to do every type of statistical work (including producing graphs) in R.

Comments

m1<-mean<-tapply(data$income,data$sex,mean)

I wasn't sure why you assigned the tapply output to the name of the built-in function mean. Although R will probably mask it, one should not use the name of any system variable, object or function for one's own objects or variables.

exactly what i needed. some of the descriptive plots required for my research were too plain. needed to modify the above codes for date-and-time variables though.

Hi, just leaving a note to thank you for your code. It is really, really good. I was trying to do my project and just couldn't figure out how to draw bar plots that compare genders side by side. So glad I found your blog. Thanks... =)