needythings: March 2015

Monday, March 23, 2015

Sorting in R Language

##sample data frame

D <- data.frame(x = c(1, 2, 3, 4), y = c(4, 5, 6, 7))
D

##sorting by x (ascending)
indexes <- order(D$x)
D[indexes,]

##reverse sorting
D[rev(order(D$y)),]
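
As an alternative to rev(order(...)), order() also takes a decreasing argument and more than one sort key. A minimal sketch, using a made-up data frame (the names df, g and v are only for illustration and are not part of the example above):

```r
# Hypothetical data for illustration only
df <- data.frame(g = c("b", "a", "b", "a"), v = c(3, 1, 2, 4))

# sort by v, largest first
df_desc <- df[order(df$v, decreasing = TRUE), ]

# sort by g ascending, then by v descending within each g
df_multi <- df[order(df$g, -df$v), ]

df_desc
df_multi
```

Negating a numeric key, as in -df$v, is a common way to mix ascending and descending keys in one call.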

Vectors, lists, matrices and data frames in R Language

##vector

x <- c(2, 5, 6, 1)
y <- c(5, 4, 8, 9)
year <- 1990:1993
names <- c("a", "b", "c", "d")

##position of element
y[1]
y[length(y)]

##list

person <- list(names = "a", x = 2, y = 9, year = 1990)
person

##list data extraction
person$names
person$x

##column bind
cbind(year, x, y)

##Data Frame
D <- data.frame(names, year, x, y)
nrow(D)

##inside Data frame
D$names
D$names[nrow(D)]

Saturday, March 21, 2015

Basic Assignment For R Programming

A basic practice assignment for R programming, useful as preparation for Coursera assignment 1.

dataset_url <- "http://s3.amazonaws.com/practice_assignment/diet_data.zip"
download.file(dataset_url, "diet_data.zip")
unzip("diet_data.zip", exdir = "diet_data")
list.files("diet_data")

##Andy

andy <- read.csv("diet_data/Andy.csv")
head(andy)
length(andy$Day)
dim(andy)
str(andy)
summary(andy)
names(andy)
andy[1, "Weight"]
andy[30, "Weight"]
andy[which(andy[,"Day"] == 30), "Weight"]
subset(andy$Weight, andy$Day == 30)
andy_start <- andy[1, "Weight"]
andy_end <- andy[30, "Weight"]
andy_loss <- andy_start - andy_end
andy_loss

##every body at once

files <- list.files("diet_data")
files

files_full <- list.files("diet_data", full.names = TRUE)
files_full

head(read.csv(files_full[3]))

andy_david <- rbind(andy, read.csv(files_full[2]))
head(andy_david)
tail(andy_david)
day_25 <- andy_david[which(andy_david$Day == 25), ]
day_25

dat <- data.frame()
for (i in 1:5) {
  dat <- rbind(dat, read.csv(files_full[i]))
}

str(dat)

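## a common mistake: dat2 is re-created inside the loop,
## so each pass throws away the earlier files and only the last one survives
for (i in 1:5) {
  dat2 <- data.frame()
  dat2 <- rbind(dat2, read.csv(files_full[i]))
}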

str(dat2)
head(dat2)

median(dat$Weight, na.rm = TRUE)
dat_30 <- dat[which(dat[, "Day"] == 30),]
dat_30
median(dat_30$Weight)

## The Real One

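weightmedian <- function(directory, day) {
  # full paths of all files in the directory
  file_list <- list.files(directory, full.names = TRUE)
  dat <- data.frame()
  # loop over every file found instead of assuming there are exactly five
  for (i in seq_along(file_list)) {
    dat <- rbind(dat, read.csv(file_list[i]))
  }
  # keep only the rows for the requested day
  dat_subset <- dat[which(dat[, "Day"] == day), ]
  median(dat_subset[, "Weight"], na.rm = TRUE)
}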

weightmedian(directory = "diet_data", day = 20)
weightmedian("diet_data", 4)
weightmedian("diet_data", 17)

summary(files_full)

##empty list
tmp <- vector(mode = "list", length = length(files_full))
tmp

for (i in seq_along(files_full)) {
tmp[[i]] <- read.csv(files_full[[i]])
}
str(tmp)

str(tmp[[1]])

##do.call() to combine tmp into a single data frame

output <- do.call(rbind, tmp)
str(output)
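
The read-into-a-list and do.call() steps can also be written in one line with lapply(). A minimal sketch follows; so that it runs on its own it does not use diet_data but writes two tiny made-up CSV files to a temporary folder (demo_dir, A.csv, B.csv and their values are assumptions for illustration only):

```r
# Hypothetical stand-in for the diet_data folder: two small CSV files
demo_dir <- file.path(tempdir(), "diet_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Day = 1:2, Weight = c(140, 139)),
          file.path(demo_dir, "A.csv"), row.names = FALSE)
write.csv(data.frame(Day = 1:2, Weight = c(200, NA)),
          file.path(demo_dir, "B.csv"), row.names = FALSE)

demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# read every file into a list, then bind the list into one data frame
combined <- do.call(rbind, lapply(demo_files, read.csv))

# median weight on day 2, ignoring the missing value
day2_median <- median(combined$Weight[combined$Day == 2], na.rm = TRUE)
day2_median
```

Building the list first and binding once avoids growing a data frame with rbind() on every pass of a loop.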

Wednesday, March 11, 2015

Facts of Sanskrit Language

If you think of Sanskrit as an old-fashioned language, consider that many developed countries are engaged in learning it. Come see why: amazing facts about Sanskrit!
1. Sanskrit is the best language for computer use. Reference: Forbes, 1987.
2. The best kind of calendar in use is the Vikram Era calendar, whose new year begins with geological changes of the solar system. Reference: German State University.
3. Sanskrit is the most useful language to speak as a medicine: speaking it keeps a person healthy and free of conditions such as high blood pressure, diabetes (madhumeh) and cholesterol. Speaking Sanskrit keeps the nervous system of the human body active, so that the person's body becomes positively charged (Positive Charges). Reference: American Hindu University (after research).
4. Sanskrit is the language of our books: the Vedas, Upanishads, Shruti, Smriti, Puranas, Mahabharata, Ramayana etc. They hold the most advanced technology. Reference: Russian State University.
5. NASA has 60,000 palm leaf manuscripts, which it is studying. Unverified reports say that the Russians, Germans, Japanese and Americans are actively researching our holy books for new things and presenting them to the world under their own names.
6. In 17 countries of the world there are one or more Sanskrit universities that study Sanskrit in order to obtain new technology, yet India itself does not have a single university dedicated to the proper study of Sanskrit.
7. Sanskrit is the mother of all the languages of the world. Nearly all languages (97%) are directly or indirectly influenced by it. Reference: UNO.
8. According to a report by a NASA scientist, the US is building 6th and 7th generation supercomputers based on the Sanskrit language, so that supercomputers can be used to their maximum extent. The project deadlines are 2025 (6th generation) and 2034 (7th generation), after which there would be a revolution around the world to learn Sanskrit.
9. Sanskrit is the best language available in the world for the purpose of translation. Reference: Forbes, 1985.
10. Sanskrit is currently being used in the "advanced Kirlian photography" technique. (At present, advanced Kirlian photography techniques exist only in Russia and the United States. India today does not have even "simple Kirlian photography".)
11. The United States, Russia, Sweden, Germany, Britain, France, Japan and Austria are currently researching Bharatanatyam and the importance of Nataraja. (Nataraja is the cosmic dance of Shiva. There is a statue of Shiva, or Nataraja, in front of the UN office in Geneva.)
12. Britain is currently researching a defense system based on our Shri Chakra.
The time to come belongs to Sanskrit, not English; learn it and teach it, and help the country advance rapidly.
13. NASA (National Aeronautics and Space Administration), America's largest organization, considers Sanskrit the most useful language for sending any message into space. According to NASA scientists, when they sent messages to space travelers, the sentences were reversed and the meaning of the message changed. They tried many languages of the world, but ran into the same problem every time. Finally they sent the message in Sanskrit, because Sanskrit sentences do not change their meaning even when reversed. This interesting information was shared at a recent ceremony by Jitram Bhatt, director of the Delhi government's oriental studies establishment.

Tuesday, March 10, 2015

Big Data Is Not Data Warehousing: Martyn Jones

Big Data is not Data Warehousing, it is not the evolution of Data Warehousing and it is not a sensible and coherent alternative to Data Warehousing. No matter what certain vendors will put in their marketing brochures or stick up their noses.

In spite of all of the high-visibility screw-ups that have carried the name of Data Warehousing, even when they were not Data Warehouse projects at all, the definition, strategy, benefits and success stories of data warehousing are known, they are in the public domain and they are tangible.

Data Warehousing is a practical, rational and coherent way of providing information needed for strategic and tactical option-formulation and decision-making.

Data Warehousing is a strategy driven, business oriented and technology based business process.

We stock Data Warehouses with data that, in one way or another, comes from internal and optional external sources, and from structured and optional unstructured data. The process of getting data from a data source to the target Data Warehouse involves extraction, scrubbing, transformation and loading, ETL for short.

Data Warehousing’s defining characteristics are:

Subject Oriented: Operational databases, such as order processing and payroll databases and ERP databases, are organized around business processes or functional areas. These databases grew out of the applications they served. Thus, the data was relative to the order processing application or the payroll application. Data on a particular subject, such as products or employees, was maintained separately (and usually inconsistently) in a number of different databases. In contrast, a data warehouse is organized around subjects. This subject orientation presents the data in a much easier-to-understand format for end users and non-IT business analysts.

Integrated: Integration of data within a warehouse is accomplished by making the data consistent in format, naming and other aspects. Operational databases, for historic reasons, often have major inconsistencies in data representation. For example, a set of operational databases may represent “male” and “female” by using codes such as “m” and “f”, by “1” and “2”, or by “b” and “g”. Often, the inconsistencies are more complex and subtle. In a Data Warehouse, on the other hand, data is always maintained in a consistent fashion.
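
The recoding described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the two source tables (payroll, orders), their column names and the target values "male" and "female" are invented here and do not come from any particular warehouse.

```r
# Hypothetical extracts from two operational systems with different gender codes
payroll <- data.frame(id = 1:3, sex = c("m", "f", "m"), stringsAsFactors = FALSE)
orders  <- data.frame(id = 4:6, sex = c("1", "2", "2"), stringsAsFactors = FALSE)

# one lookup table per source maps its local codes to a single representation
payroll_map <- c(m = "male", f = "female")
orders_map  <- c("1" = "male", "2" = "female")

payroll$gender <- unname(payroll_map[payroll$sex])
orders$gender  <- unname(orders_map[orders$sex])

# after integration both sources share one consistent column
warehouse <- rbind(payroll[, c("id", "gender")], orders[, c("id", "gender")])
warehouse
```

Keeping one mapping per source makes each inconsistency explicit, which is the kind of scrubbing and transformation work done before loading.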

Time Variant: Data warehouses are time variant in the sense that they maintain both historical and (nearly) current data. Operational databases, in contrast, contain only the most current, up-to-date data values. Furthermore, they generally maintain this information for no more than a year (and often much less). In contrast, data warehouses contain data that is generally loaded from the operational databases daily, weekly, or monthly, which is then typically maintained for a period of 3 to 10 years. This is a major difference between the two types of environments.

Historical information is of high importance to decision makers, who often want to understand trends and relationships between data. For example, the product manager for a Liquefied Natural Gas soda drink may want to see the relationship between coupon promotions and sales. This is information that is almost impossible – and certainly in most cases not cost effective – to determine with an operational database.

Non-Volatile: Non-volatility means that after the data warehouse is loaded there are no changes, inserts, or deletes performed against the informational database. The Data Warehouse is, of course, first loaded with cleaned, integrated and transformed data that originated in the operational databases.

We build Data Warehouses iteratively, a piece or two at a time, and each iteration is primarily a result of business requirements, and not technological considerations.

Each iteration of a Data Warehouse is well bound and understood – small enough to be deliverable in a short iteration, and large enough to be significant.

Conversely, Big Data is characterised as being about:

Massive volumes: so great are they that mainstream relational products and technologies such as Oracle, DB2 and Teradata just can’t hack it, and

High variety: not only structured data, but also the whole range of digital data, and

High velocity: the speed at which data is generated, transmitted and received.

These are known as the three Vs of Big Data, and they are subject to significant and debilitating contradictions, even amongst the gurus of Big Data (as I have commented elsewhere: Contradictions of Big Data).

From time to time, Big Data pundits slam Data Warehousing for not being able to cope with the Big Data type hacking that they are apparently used to carrying out, but this is a mistake of those who fail to recognise a false Data Warehouse when they see one.

So let’s call these false flag Data Warehouse projects something else, such as Data Doghouses.

“Data Doghouse, meet Pig Data.”

Failed or failing Data Doghouses fail for the same reasons that Big Data projects will frequently fail. Both will almost invariably fail to deliver artefacts on time and to expectations; there will be failures to deliver value or even simply to return a break even in costs versus benefits; and of course, there will be failures to deliver any recognisable insight.

Failure happens in Data Doghousing (and quite possibly in Big Data as well) because there is a lack of coherent and cohesive arguments for embarking on such endeavours in the first place; a lack of real business drivers; and, a lack of sense and sensibility.

There is also a willing tendency to ignore the advice of people who warn against joining in the Big Data hubris. Why do so many ignore the ulterior motives of interested parties who are solely engaged in riding on the faddish Big Data bandwagon to maximise the revenue they can milk off punters? Why do we entertain pundits and charlatans who ‘big up’ Big Data whilst simultaneously cultivating an ignorance of data architecture, data management and business realities?

Some people say that the main difference between Big Data and Data Warehousing is that Big Data is technology, and Data Warehousing is architecture.

Now, whilst I totally respect the views of the father of Data Warehousing himself, I also think that he was being far too kind to the Big Data technology camp. However, of course, that is Bill’s choice.

Let me put it this way, if Oracle gave me the code for Oracle 3, I could add 256 bit support, parallel processing and give it an interface makeover, and it would be 1000 times better than any Big Data technology currently in the market (and that version of Oracle is from about 1983).

Therefore, Data Warehousing has no serious competing paragon. Data Warehousing is a real architecture, it has real process methodologies, it is tried and proven, it has success stories that are no secrets, and these stories include details of data, applications and the names of the companies and people involved, and we can point at tangible benefits realised. It’s clear, it’s simple and it’s transparent.

Smart Data Collective

Sunday, March 8, 2015

Why Do We Fail?

Your Ideals are Messed Up.

Sometimes, to be successful, you have to flip off some of your ideals. "I won't work with disrespectful subordinates", "I won't heed bossy authorities" - these are thought processes which might not get you too far. Sometimes, you have to suck it up. If you want to be treated like a king from day one, chances are you'll be stabbed in the main market even while you run your campaign.

Your Goals are Shaky.

Now you want to be a CEO of a startup, and you start to lay out your plan the next morning. Great idea. Go for it. But, you encounter a hurdle with the financing a day later, and you let your project go down the drain. Suddenly, next week you want to become a stockbroker. No, no, no, no, no. Not that easy, you see. You've got to face the fact that you are not omnipotent. Choose a goal and stick to it. No matter how bad it may seem. Goal switching is one of the major causes of failure.

Choose Wisely.

For those who get the Indiana Jones reference, "you're the man". Those who haven't, first go Google the reference, then come back. You need to see the underlying purpose of your dreams. Suppose you dream about steak one day, the next day, you dream about tofu, then burgers, then salmon. Doesn't mean you want to be a meat-lover one day, then a vegan, then a junkie, and finally a cat. It means you are hungry and you want food. So, choose the one best suited to you. Most goals you'll have in your life will be different forms of "fame" and "money". So, follow the way which seems most feasible to you. Break down your dream into simpler components. You'll be a lot healthier and less confused.

There is a word called "Farfetched".

Call "bullcrap" on the people selling "you can be anyone". They are selling you that probably because of one of the following reasons -

1. They want to sell you something which will help you "be anyone".
2. They have recently come across some great success story or quotation, and they want to share it with you.
3. They are your parents and they just want to see you happy.

I am not saying you have to limit yourself, but it's good to ascertain the scalability of your dreams from time to time. Heck, I want to land a job at Google, so what do I do? Do I directly go apply to their head offices and get shot down? No, I try my hand at small programming challenges, raise my level slowly. Before trying to fight off a lion, it's a preferable idea to see if your cat listens to you.

You dream too much.

So, when do you dream too much? Right, when you sleep too much. Today, you want a startup, you don't have an idea, but you are attracted by the seemingly lucrative life and fame of a successful startup. Tomorrow you want to be a professional hacker. You can't code worth crap, but you are mesmerized by how cool it sounds on television. These are not failures, this is just you waking up from a daydream and realizing, "Fuck! I slept through the class again".

You have embraced Failure.

Remarkably enough, there is a class of people who are exactly opposite to the kind mentioned in the "farfetched" point. The "mood-breakers". They will join the team with an attitude like, "What's the point of all this?". Or worse still, the "Fake Tyler Durden"s, who are all about the place with, "This is corporate greed.", "You are not defined by your job or your money". Fact is - you are. I'd like to see one person who keeps worshipping that brilliant movie (and missing its actual message) live one night, just one night in a place like the one Tyler used to have.

Your Resources are "Questionable".

Little story from back in the days. I had this friend who used to study his ass off all throughout the year. He attended every class, ass-kissed every teacher who needed ass-kissing, jotted down every word, and prepared extensive notes. I had this other friend, who used to borrow this guy's notes, study for a couple of days and score much more than him. The second guy's secret? He knew what to study. I remember my school mathematics teacher telling me, "Work hard. But, like a horse, not a donkey." So, before you read a thousand books on a subject, ask yourself how much of this will actually be required to pursue the end goal you seek.

The Final Line.

Now that we have taken care of all that. You are setting a validated goal, you are determined on it, you work hard for it, and even then you failed? Luck. You can win when luck is not with you, you can win when luck is with you, but, you can't win when luck decides to screw you over, and over and over again. You'll just have to keep trying and hope luck's patience runs out earlier than yours.

Believe me when I say, I know a lot about failures. I am 21, and I've failed at more things than most people ever even try in their lifetime. So, I hope you become more successful than me. Just kidding, I don't.

Tuesday, March 3, 2015

Why do most vegetarians in India dislike non-vegetarian food without even tasting it?

Because I can't stand the suffering that the animal has gone through before reaching the chef's hands. His neck has been broken. His stomach has been cut open. His guts have been poured out. His eyes have been plucked out. His blood has been drained. Then the flesh is cut with a knife.

I can die but can never eat non-vegetarian food for this reason. Can't bear to see the pain of the animals. Can't imagine the struggle of the animal when the butcher or a machine is cutting it. It is struggling to survive and we are willfully killing it. How can I eat that and support such slaughter? Just can't.

Forget taste, even if it gives me longer life, I won't eat it. Even if it makes me 10 times stronger or 10 times more intelligent, I still won't eat it. The question of taste never even enters my mind after seeing meat. All I see is the horror of a dead animal.

Most people have some kind of boundary with respect to what they will or won't eat, even non-vegetarians. For example, most non-vegetarian Hindus don't eat beef because it's taboo for religious and cultural reasons. Most American non-vegetarians will eat beef, but not dogs--again, because it's taboo to eat "man's best friend." Other societies may consume dogs, but draw lines elsewhere.

Are you an American? Have you ever eaten dog, horse, dolphin? Jellyfish, cockroaches, haggis? If not, does the thought of consuming any of those things seem repugnant, whether for moral or environmental reasons, or general grossness? Almost everyone has something they won't eat. At the very least, consuming human flesh is taboo in almost every culture, even if most of us have never tasted it.

Everyone draws lines. Vegetarians just happen to draw their lines to proscribe all consumption of meat which, as a lifestyle choice, is actually more consistent than avoiding dog, beef, or cockroach in particular. I'm a vegetarian and an adult capable of thinking and making choices for myself. I don't eat meat for ethical and environmental reasons, and I don't think tasting it would ever change my mind.

Sunday, March 1, 2015

Linear, generalized linear, and generalized least squares models (LM, GLM, GLS) using R Language

#lm: linear regression, normal errors, constant variance
# Y = a + b*X + E (a linear predictor)
#glm: generalized linear model, non-normal errors, non-constant variance
# log(E[Y]) = a + b*X (log link, e.g. Poisson)
# E[Y] = e^a * e^(b*X) (multiplicative: exponential on the original scale, linear on the log scale)
# in a glm with a log link, each slope gives an estimate of the multiplicative change
# in the response variable for a one-unit change in the corresponding explanatory variable
#gls: generalized least squares model, correlated errors (spatial, temporal patterns/trends)

airquality

plot(Ozone~Wind,airquality)

model1=lm(Ozone~Wind,airquality)
plot(model1)
coef(model1)

#prediction for Wind speed at 19 and 20 mph

coef(model1)[1]
Ozone1=coef(model1)[1]+coef(model1)[2]*19
Ozone2=coef(model1)[1]+coef(model1)[2]*20

Ozone1
Ozone2
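
The same two predictions can be obtained with predict(), which avoids typing the coefficient arithmetic by hand. A minimal sketch (the names fit, new_wind, pred and manual are chosen here for illustration; airquality ships with R):

```r
# refit the same linear model under a separate name
fit <- lm(Ozone ~ Wind, data = airquality)

# predict() needs a data frame whose column name matches the predictor
new_wind <- data.frame(Wind = c(19, 20))
pred <- predict(fit, newdata = new_wind)

# the hand-computed version, for comparison
manual <- coef(fit)[1] + coef(fit)[2] * c(19, 20)

pred
all.equal(unname(pred), unname(manual))
```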

##poisson regression is a generalized linear model

model2=glm(Ozone~Wind,airquality,family=poisson)
model2

# for comparison, coef(model1) from the linear model above gives
# Coefficients:
#(Intercept) Wind
# 96.873 -5.551

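#predictions at 19 and 20 mph: the wind speed goes inside exp() with the slope
Ozone1.glm=exp(coef(model2)[1]+coef(model2)[2]*19)
Ozone2.glm=exp(coef(model2)[1]+coef(model2)[2]*20)
Ozone1.glm
Ozone2.glm

plot(Ozone~Wind,airquality)

#multiplicative change in Ozone for a 1 mph increase in Wind
Ozone2.glm/Ozone1.glm

#the same number: exp() of the Wind slope
exp(coef(model2))[2]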
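
With a log link the fitted mean at wind speed w is exp(a + b*w), so the ratio of the predictions at w + 1 and at w is exp(b). A small sketch to check this with predict(); the names pois_fit, mu and ratio are chosen here for illustration:

```r
# Poisson regression of Ozone on Wind (log link is the default for poisson)
pois_fit <- glm(Ozone ~ Wind, data = airquality, family = poisson)

# type = "response" returns predictions on the original Ozone scale
mu <- predict(pois_fit, newdata = data.frame(Wind = c(19, 20)),
              type = "response")

# ratio of the prediction at 20 mph to the prediction at 19 mph
ratio <- unname(mu[2] / mu[1])

ratio
all.equal(ratio, unname(exp(coef(pois_fit)[2])))
```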

###gls

library(nlme)

#gls() fails here because Ozone contains missing values; try() lets the script continue
model3.gls=try(gls(Ozone~Wind,airquality))

#so tell gls how to handle the NAs
model3=gls(Ozone~Wind,airquality,na.action=na.exclude)
head(airquality)
?airquality

#build a Date column (the measurements are from 1973)
paste(1973,airquality$Month,airquality$Day, sep="-")
airquality$Date=as.Date(paste(1973,airquality$Month,airquality$Day, sep="-"))

airquality$Date

library(lattice)
xyplot(Ozone~Date,airquality)

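model4=gls(Ozone~Wind*Date,airquality,na.action=na.exclude)
#alternatively, drop the rows with missing Ozone first
air2=subset(airquality, complete.cases(Ozone))

model5=gls(Ozone~Wind*Date,air2)
#autocorrelation of the residuals, ordered by date
plot(ACF(model5,form=~Date),alpha=0.05)

#refit with an AR(1) correlation structure
model6=update(model5,correlation=corAR1())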

library(MuMIn)

AICc(model5,model6)

summary(model6)