Numbers and US

Story that numbers tell us

Archive for the ‘Academic Writing’ Category

Does Economics Violate the Laws of Physics?


They say, and at this age I have realized it too, that exercise is necessary to keep the body fit. Thinking along the same lines, if it applies to the body, it must apply to the brain and mind.

So here is an article that says that “Neoclassical economics is inconsistent with the laws of thermodynamics”: two areas I never thought of in the same breath.


Written by SK

October 24, 2009 at 7:52 pm

Aligning Scores


For the past few days, the IITs have been in the news again. HRD minister Kapil Sibal wants to give weightage to class twelfth performance as well for selection. He wants the minimum marks to qualify for the test to be eighty percent. While this proposal has been a topic of hot discussion, I kept thinking about its execution from a statistical perspective.

The burning question, we will all agree, is that no two board examinations are similar in terms of strictness in awarding marks. While it’s easier, I mean relatively easier, to get 90% in CBSE and ICSE, it’s next to impossible to get similar marks in a few state boards, like the Bihar Intermediate examination.

Now, if we want the screening rule to be unbiased, we need to account for the reality that 80% in one board examination is not 80% in another. This can be done by creating an unbiased methodology to project all the scores onto the same axis, that is, by aligning the scores of the different board examinations.
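To make the idea of a common axis concrete, here is a minimal sketch of one possible alignment: matching percentile ranks across boards. This is only an illustration, not necessarily the approach the next post settles on. The function names and the marks are made up for the example.

```python
from bisect import bisect_right

def percentile_rank(score, board_scores):
    """Fraction of students in the board scoring at or below `score`."""
    ordered = sorted(board_scores)
    return bisect_right(ordered, score) / len(ordered)

def equivalent_score(score, from_board, to_board):
    """Score in `to_board` that sits at the same percentile rank
    as `score` does in `from_board`."""
    rank = percentile_rank(score, from_board)
    ordered = sorted(to_board)
    # Position of the student holding that rank in the target board.
    index = max(0, min(len(ordered) - 1, round(rank * len(ordered)) - 1))
    return ordered[index]

# Hypothetical marks (percent) of ten students in each board.
lenient_board = [62, 70, 75, 80, 84, 88, 90, 92, 95, 97]
strict_board = [40, 45, 50, 55, 58, 62, 66, 70, 75, 82]

aligned = equivalent_score(80, lenient_board, strict_board)
print(aligned)
```

With this kind of mapping, a cut-off would be set on the percentile rank rather than on the raw percentage, so a student is compared with peers who sat the same examination.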

Before getting into the details of approaches to accommodate the bias, let us list a few business scenarios where we might have to do a similar task.

Competitor Analysis:

a) An auto finance company gets a number of requests for refinancing of auto loans. In this situation, the company would have data on the interest rate charged by the previous financier. It would also have complete application data and bureau data for the customer. The interest rate charged is a function of application data and an internally developed risk score based on bureau data. Now if an auto finance company is able to align its internal risk score with a competitor’s risk score, it certainly has an edge over that competitor. The details will be clear from the paper linked below.

b) A similar problem for an insurance company: please have a look at this patented method for aligning two scores. It can potentially help a company align the premium it charges with a competitor’s premium. Unfortunately, like all patent filings, it’s not very easy to grasp the methodology in one reading. Allow me to digress for a while: it would be really helpful if the law required a patentee to file an easy-to-understand document along with the usual patent filing. It’s understandable why patentees construct the claims the way they do; it’s necessary to prove infringement.

There are various examples in the competitor analysis domain where we would have to perform a similar task. Another scenario could be this: let’s say the risk team has developed a risk scorecard that is one of the inputs for pricing. Earlier the risk team used the traditional FICO score, but in the newer model they have used the NextGen score for the internal risk scorecard. The pricing team applies rules on the FICO score and the internal risk score. With the new risk score, the pricing team has to apply rules on the NextGen score and the internal score, but they don’t have access to the NextGen scores of previous customers to reprice them.

Having explained a few examples where aligning scores is of paramount importance, we still have to solve the problem of aligning marks. The details will be in the next post.

Written by SK

October 20, 2009 at 7:05 pm

When to buy flight ticket


Sometime or other, we all, as customers, have been irritated to learn that a friend or neighbour has shelled out less money than us for the same product bought through the same channel.

We, as customers, felt cheated. In fact, when dynamic pricing was first introduced, consumers considered it a rogue method of doing business. Consequently, businesses lost business (though even at that time airlines were doing dynamic pricing). This could be explained as a) a double standard of business morality set by consumers, or b) morality based on product (dynamic morality).

Currently, dynamic pricing is prevalent in a number of industries. Airlines, hotels, e-business, insurance and banking all rely on this model. There are different models to calculate the price; the most basic of those is the inventory-based approach. For airlines, at a broad level, the price would be a function of the seats available and the number of days left before the journey.

Anyway, the purpose of this entry is to introduce a start-up that came forward to help customers decide when to buy an airline ticket. Though the company is not very new, I came across it recently. A great idea that solves a real-life problem and makes a profit. The company is Live Search Farecast. They have patented their model to predict prices. Last year, Microsoft acquired Farecast, and now it’s Bing Travel.
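As a toy illustration of an inventory-based rule, the sketch below prices a seat from the two inputs mentioned above. The multiplicative form, the 90-day booking window and all the numbers are assumptions made up for the example; real airline pricing models are far more elaborate.

```python
def ticket_price(base_fare, seats_left, total_seats, days_left, booking_window=90):
    """Toy inventory-based fare: rises as seats sell out and as departure nears.

    The formula is a hypothetical illustration, not any airline's actual rule.
    """
    if not 0 <= seats_left <= total_seats:
        raise ValueError("seats_left must lie between 0 and total_seats")
    # 0.0 when the plane is empty, 1.0 when it is sold out.
    scarcity = 1 - seats_left / total_seats
    # 0.0 at the start of the booking window, 1.0 on the day of travel.
    urgency = 1 - min(days_left, booking_window) / booking_window
    return round(base_fare * (1 + scarcity) * (1 + urgency), 2)

# Same flight, three moments in its booking life (made-up numbers).
early = ticket_price(100, seats_left=180, total_seats=180, days_left=90)
midway = ticket_price(100, seats_left=90, total_seats=180, days_left=45)
last_seat = ticket_price(100, seats_left=0, total_seats=180, days_left=0)
print(early, midway, last_seat)
```

Two customers buying the same seat class through the same channel can therefore see quite different fares simply because they bought at different points on this curve.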

Written by SK

September 15, 2009 at 6:50 am

The Importance of remittance


Remittances account for about thirty five percent of the NSDP (net state domestic product) of Kerala for the year 08-09. The figure of 35% of NSDP was really a surprise for me, so I did some research to understand it. Before I write about my findings on remittances from the Indian and global perspectives, have a look at a few more figures on the importance of remittances for Kerala.

  • Kerala has more than 27 emigrants per 100 households.
  • Remittances are sufficient to wipe out more than 60% of the state’s debt.
  • Remittances were 1.74 times the revenue receipts of the state, 7 times the transfers to the state from the central government and 1.8 times the annual expenditure of Kerala.

These statistics are based on 2005 remittances and are taken from a research paper.

While for a few countries remittances as a percentage of GDP go up to 45%, for India and a few other countries remittances as a percentage of GDP for the year 2007 look like this.

[Figure: remittances as a percentage of GDP]

Remittance inflow in million $.

[Figure: remittance inflow in $]

Top countries from which emigrants send money to their native countries.

[Figure: remittance outflow]

Now have a look at the remittance trend for India. Notice how the rate of increase of remittances is itself increasing. The remittance amounts are in million US $. The details can be found in the data published by the World Bank.

[Figure: remittance trend]

I couldn’t lay my hands on the remittances received by all the states within India, or on how much remittances contribute to their NSDP. I will publish the figures when I get them.

Written by SK

August 18, 2009 at 7:20 pm

Interaction Variable


Suppose that in a laboratory experiment we are trying to figure out the sweetness of tea as a function of the variables ‘quantity of sugar’ and ‘frequency of stirring’. After a full day of methodical experiments, we entered the data in our experiment book.

[Figure: sugar experiment data]

Now we run a linear regression analysis on our data to get the following equation, with an adjusted R^2 value of 0.90.

Sweetness = 1.93 sugar + 5.37 stirring freq – 6.43

Though the adjusted R^2 is good enough, we create one more variable, sugar*stirring freq. We run the regression model again with the assumed relationship Sweetness = c1 * sugar + c2 * sugar * sf + c3, where sf is the stirring frequency.

Eureka! We see that we now have an adjusted R^2 of 0.9966.

Now the valid question would be: how could we have thought of adding a variable like sugar*sf? True, the choices are plenty; we could have had sugar*sugar*sf or sf*sf.

The answer lies in exploratory data analysis. It’s always very helpful and insightful to plot all the explanatory variables against the dependent variable, and see how they change with respect to each other.
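The comparison between the two regressions can be sketched in a few lines of plain Python. The data below are synthetic, generated from a relationship that contains a sugar*sf term; they are not the readings from the experiment book, so the numbers will not match 0.90 and 0.9966. The helper names `solve` and `fit` are made up for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(rows, y):
    """Ordinary least squares via the normal equations.

    Returns (coefficients with the intercept last, adjusted R^2).
    """
    x = [list(row) + [1.0] for row in rows]  # append the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    coef = solve(xtx, xty)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    n, p = len(y), k - 1  # observations, predictors excluding the intercept
    adj_r2 = 1 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))
    return coef, adj_r2

# Synthetic data (an assumption for illustration): sweetness depends on sugar*sf.
data = [(s, f) for s in (1, 2, 3, 4) for f in (1, 2, 3, 4)]
sweetness = [2.0 * s + 1.5 * s * f + 1.0 for s, f in data]

# Model 1: sweetness ~ sugar + sf
_, additive_r2 = fit([[s, f] for s, f in data], sweetness)
# Model 2: sweetness ~ sugar + sugar*sf
coef, interaction_r2 = fit([[s, s * f] for s, f in data], sweetness)

print(additive_r2, interaction_r2)
```

On such data the purely additive model already fits reasonably well, which is exactly why the missing interaction term is easy to overlook without plotting the variables.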


Looking at the plot above, it would not have been difficult to try the equation that gave us such a good result. Variables like these are called interaction variables. Think of the experiment we just tried: it seems logical that stirring would have more effect on sweetness when the sugar quantity is high, and vice versa.

Though a good look at, or understanding of, the explanatory variables is the best guide to creating interaction variables, it gets cumbersome when higher-order interaction effects exist. Automatic Interaction Detector (AID) and CHAID are statistical methods that have been developed to save us from this strenuous mental exercise.

A real-life example:

  • Sale of opera tickets: A statistical profile of opera ticket buyers reveals that they are both highly educated and upper income. This information can be leveraged to build a model for opera ticket buyers, but as we know, not everyone in the upper income segment is highly educated, nor do all the highly educated belong to the high income stratum. In this case we would like to have a third variable that reflects the fact that a person is both highly educated and upper income. This third variable is an interaction variable.

Written by SK

August 9, 2009 at 12:54 pm

Forecasting value of a Real Estate Property


“Real estate is the safest and most rewarding investment”. This idea has been thrust into our understanding like divine words. Until the recent financial turmoil, no one ever dared to question it, fearing it would be sacrilegious and would amount to utter stupidity.

The assumption that real estate value increases perpetually was one of the basic mistakes that led to this crisis. Now that this fundamental has been jolted, there is a need and an urgency to develop a model to forecast house prices. How can we understand this variable along with other macroeconomic parameters?

Let me digress for a while, for two paragraphs.

What do you think the house price would be: a leading, coincident or lagging indicator? To me, at first thought, it should be coincident. Moreover, it would rarely be used as an economic indicator to forecast the economy. The real estate market is so illiquid, and data points are so few, that it would not be too wise to use it as an independent variable. Nonetheless, research has shown that there is a high correlation between the REIT (Real Estate Investment Trust) index and the S&P 500 stock index.

Secondly, the most basic logic I hear in favor of a perpetual increase in real estate value is that the population is increasing and everyone needs a home to stay in, so the price of land has to increase, and so does that of houses. There is no denying this simplistic logic. Simple logic is more often than not the most valid logic, but here there are a number of other factors.

Now coming back to forecasting real estate value. It would be better if we first go through the common and current methods of valuing real estate property.

The four common methods to value real estate:

1: Cost method: here the value is determined by the replacement cost of improvements plus an estimate of the value of the land. The replacement cost is relatively easy to determine using current construction costs, but valuation of land is a tricky business.

2: Sales comparison method: this method uses the price of a similar property or properties from recent transactions. Prices from other properties must be adjusted for characteristics unique to each property and for market conditions. This method requires comparable sales data. If we have good data points, a more comprehensive approach would be to go for hedonic regression, where specific characteristics of properties are quantified.

3: Income method: this method, as the name implies, uses the discounted cash flow model, that is, the present value of future income. If NOI is the net operating income from the property, the value of the property is NOI/r, where r is the estimated required market rate.

4: Discounted after-tax cash flow model: this is a variation of the above model where we consider the marginal tax rate of the investor as well. The net present value of a property equals the present value of after-tax cash flows, discounted at the investor’s rate of return, minus the equity portion of the investment.
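Methods 3 and 4 reduce to short formulas, sketched below. The NOI/r form corresponds to a level income stream received in perpetuity; the after-tax version assumes cash flows arrive at the end of each year. The function names and all figures are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def income_value(noi, required_rate):
    """Income method: value of a level, perpetual NOI stream, NOI / r."""
    if required_rate <= 0:
        raise ValueError("required_rate must be positive")
    return noi / required_rate

def after_tax_npv(after_tax_cash_flows, investor_rate, equity_invested):
    """Present value of after-tax cash flows (end of year 1, 2, ...)
    discounted at the investor's rate, minus the equity put in."""
    present_value = sum(
        cash_flow / (1 + investor_rate) ** year
        for year, cash_flow in enumerate(after_tax_cash_flows, start=1)
    )
    return present_value - equity_invested

# Made-up figures: NOI of 120,000 a year and a required rate of 8%.
value = income_value(120_000, 0.08)

# Made-up figures: two years of after-tax cash flows, 10% required return,
# 150 of equity invested today.
npv = after_tax_npv([110, 121], investor_rate=0.10, equity_invested=150)

print(round(value), round(npv, 2))
```

A positive NPV in the second calculation means the after-tax cash flows more than repay the equity at the investor’s required return.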

Having gone through all the methods of valuation, it is clear that rather than us choosing the method, it’s the method that chooses us, depending upon the context, situation and availability of data.

Forecasting real estate values won’t be an easy path. We might have to use various methods and make a number of assumptions. In the next post I will discuss a few common approaches economists have adopted for it; meanwhile, please let me know how you would tackle this problem.

Written by SK

July 31, 2009 at 1:50 pm

CPI : Corruption Perceptions Index


Transparency International (TI), a civil society organization, comes out with a less famous CPI every year. Compared to the popular one, the Consumer Price Index, this CPI, the Corruption Perceptions Index, is more subjective. While the calculation methodology of the Consumer Price Index seems intuitive even to the uninitiated, the methodology for the Corruption Perceptions Index raises a few questions in an inquisitive mind.

The first observation that has to be made is that this is a Corruption Perceptions Index. It’s a perception. So while keeping in mind that this is a perception, a subjective index, we must ask how TI tries to make the composite index as objective as possible.

Corruption, for the purpose of the CPI, is defined as the abuse of public office for private gain. The CPI is a composite index. The CPI 2008 draws on 13 different polls and surveys from 11 independent institutions. All institutions provide a ranking and the overall extent of corruption of the countries they cover. The definition of corruption is the same as given above, and the surveys exclude matters such as political instability and nationalism.

[Figure: CPI 2008]

Sources of Data

TI receives corruption scores from all the surveying institutions. Not all survey providers are treated alike. While the data compiled by the African Development Bank, the Asian Development Bank and CPIA are taken as they are, since these sources regularly analyze a country’s performance and crosscheck it with peers, the data compiled by IMD and PERC are averaged with last year’s to reduce abrupt variations in scoring from random effects.

Type of questions

ADB, AFDB and the CPIA by the World Bank ask about ineffective audits, conflicts of interest and policies being manipulated by corruption, on a scale of 1 (bad) to 6 (good). BTI asks questions such as to what extent the government can contain corruption or to what extent the public is tolerant of official corruption. Likewise, all the surveys compile data on the extent of corruption on some scale.

Sample Design

While all sources provide data on the extent of corruption, the sample design varies from institution to institution. For ADB and AFDB, the sample is foreigners with business experience in the local country, to avoid home-country bias. IMD and PERC put their questions to local residents or expatriates.

Now one may raise the concern that there may be a problem of circularity, that is, the previous year’s score might affect respondents’ responses. TI tested this hypothesis in 2006, and it was found not to be true.

Standardization of data

Written by SK

July 28, 2009 at 5:52 pm
