Data disruption in trucking

Posted in Analytics, Data visualization, Decision Making, Disruption Opportunities, Internet of Things (IoT)

The global trucking revenue pool is close to USD 2 trillion, about 20X the cab market's revenue pool. Even in developed markets such as the US, it is a highly fragmented, antiquated business that makes little use of technology and data.

If you are an aspiring leader in technology and data, this is the place to be for the next 5-10 years, for the following three reasons:

  1. It is one of the largest and fastest-growing markets in which to create impact, second only to goods commerce. The more internet commerce grows, the higher the need for logistics and trucking to move goods. The next Amazon or Alibaba will come from supply chain technology and data disruption.
  2. The technology and data play is only beginning. Data availability is increasing exponentially through GPS, smartphones and IoT sensors.
  3. The problems are far more challenging and futuristic. They require an interplay of automation via IoT and driver-assist systems, advanced mathematics and algorithms, and high-quality UI/UX to drive adoption. Few other sectors offer such a wide range and depth of problems.

Rivigo is leading the wave of disruption in trucking through a combination of the following factors:

  • Unique operational ideas based on a driver and network relay, a global first
  • An outstanding leadership team across business, operations and technology
  • A strong and unflinching belief in the power of data

Rivigo has already attained high-quality business scale in India and aspires to build solutions that apply globally. In the truest sense, it has the potential to do for trucking what Amazon and Alibaba have done for commerce, what Uber has done for cabs, and what several other disruptors have done for large global markets. The next 5-10 years are going to be exciting and enriching. Here are some sample problems the Rivigo tech and data teams work on:

Network relay model

The driver relay model needs sophisticated technology to ensure that trucks can run smoothly every month through millions of pilot changeovers. The underpinning of this technology is a network model that predicts estimated time of arrival, along with simulation models for vehicle arrivals, wait-time optimization, and driver performance and behavior. The model brings everything together from the network and creates a coherent stream of output that makes the pit-stop changeover process seamless and scalable.
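As a toy illustration of the ETA piece, one can estimate the time to the next pit stop from recent speed samples. This is a minimal sketch under assumed inputs (remaining distance and a short speed history); a production model would also fold in route history, traffic and predicted pit-stop wait times.

```python
# Hedged sketch: ETA to the next pit stop from a rolling average of
# recent speed samples. Window size is an illustrative assumption.

def eta_hours(remaining_km, recent_speeds_kmh, window=6):
    """Estimate hours to the next pit stop from recent speed samples."""
    recent = recent_speeds_kmh[-window:]          # keep only the last few samples
    avg = sum(recent) / len(recent)               # simple rolling-average speed
    return remaining_km / avg
```

For example, 120 km remaining at a recent average of 60 km/h gives an ETA of 2 hours.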

Fuel analytics and optimization

Fuel is one of the biggest operating costs in logistics, and fuel pilferage is a rampant problem for any trucking company with a fleet of vehicles. However, reliable technology solutions to prevent pilferage are not available today: fuel-level readings fluctuate, and the data has to be processed in real time to catch even small drops in fuel value. A fuel graph is a volatile time series, similar to some financial time series, and requires both predictive and heuristic problem-solving approaches. We are building patented fuel technology involving complex algorithms and data science models to improve fuel efficiency.
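A bare-bones version of the detection idea: smooth the noisy fuel-level series, then flag sharp drops between consecutive samples. The window size and drop threshold below are invented for illustration, not Rivigo's actual parameters.

```python
# Minimal sketch of fuel-drop detection over a noisy fuel-level series.
# Window and threshold values are illustrative assumptions.

def smooth(readings, window=5):
    """Median-smooth raw sensor readings to damp sloshing noise."""
    half = window // 2
    out = []
    for i in range(len(readings)):
        seg = sorted(readings[max(0, i - half):i + half + 1])
        out.append(seg[len(seg) // 2])            # median of the local window
    return out

def detect_drops(readings, threshold=8.0):
    """Flag indices where the smoothed fuel level falls sharply."""
    s = smooth(readings)
    return [i for i in range(1, len(s)) if s[i - 1] - s[i] > threshold]
```

A sudden 20-litre step down in an otherwise flat series is flagged at the sample where it occurs, while sensor jitter below the threshold is ignored.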

Resource allocation and optimization

In trucking, any idle capacity, whether a truck or a driver, is fungible capacity. You cannot keep too little or too much capacity at any point in the network. This is a massive problem and requires queuing theory, linear programming and advanced mathematical modeling to keep the system optimized and balanced.
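To make the balancing idea concrete, here is a greedy toy that moves surplus idle trucks to depots running a deficit. The real problem is a min-cost flow or LP over the whole network with movement costs and time windows; depot names here are invented.

```python
# Toy sketch of capacity rebalancing: depots with surplus idle trucks
# supply depots with deficits. A real system would solve this as a
# min-cost flow / linear program; this greedy pass only illustrates
# the balancing idea.

def rebalance(surplus, deficit):
    """Pair surplus depots with deficit depots until one side is exhausted.

    surplus, deficit: dicts of depot -> truck count.
    Returns a list of (from_depot, to_depot, trucks) moves.
    """
    moves = []
    src = [(d, n) for d, n in surplus.items() if n > 0]
    dst = [(d, n) for d, n in deficit.items() if n > 0]
    i = j = 0
    while i < len(src) and j < len(dst):
        s_depot, s_n = src[i]
        d_depot, d_n = dst[j]
        k = min(s_n, d_n)                  # move as many as both sides allow
        moves.append((s_depot, d_depot, k))
        src[i] = (s_depot, s_n - k)
        dst[j] = (d_depot, d_n - k)
        if src[i][1] == 0:
            i += 1
        if dst[j][1] == 0:
            j += 1
    return moves
```

For example, with 3 idle trucks at one depot and 2 at another, and a deficit of 4 elsewhere, the sketch emits two moves totalling 4 trucks.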

Human behavior analysis

Good driving is at the core of successful logistics. This means that every minute of driving across the network has to be monitored and analysed. Big data from the past and present has to be constantly evaluated to determine and predict driver behaviour, and it has to happen in real time so that immediate corrective action can be taken. Is the driver in control of the vehicle? Is the driver driving carefully and cautiously? These are just some of the questions that need to be answered to turn a qualitative system into a quantitative model.
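One tiny slice of such a quantitative model: counting harsh-braking events from per-second speed samples and folding them into a crude score. The 12 km/h-per-second threshold and the scoring weights are assumptions for illustration only.

```python
# Hedged sketch: score driving from per-second speed samples (km/h).
# Threshold and penalty weights are illustrative assumptions; a real
# model would combine many signals (speeding, cornering, rest patterns).

def harsh_brake_events(speeds, threshold=12.0):
    """Count one-second decelerations sharper than the threshold."""
    return sum(1 for a, b in zip(speeds, speeds[1:]) if a - b > threshold)

def driving_score(speeds, threshold=12.0):
    """Crude 0-100 score: subtract a fixed penalty per harsh event."""
    return max(0, 100 - 10 * harsh_brake_events(speeds, threshold))
```

A trace with two sudden 15 km/h drops in consecutive seconds scores 80 under this toy scheme, while a smooth trace scores 100.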


Geo analytics

All trucks at Rivigo are fitted with several different IoT sensors. These sensors generate massive amounts of data that needs to be processed, consumed and analysed, and the data science applied to it turns Rivigo trucks into smart trucks. The smart trucks run on a geo-grid, and we are building a very advanced location analytics engine for constant monitoring and simulation of intelligent events. We are also building an artificial intelligence layer, based on machine learning and deep learning, for problems such as demand-supply matching, traffic maps (imagine Google Maps for logistics), and hotspot and density analysis.
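The hotspot and density piece can be sketched with a simple geo-grid: bucket GPS pings into fixed-size cells and rank cells by count. The 0.01-degree cell size (roughly 1 km at the equator) is an illustrative assumption.

```python
# Sketch of geo-grid density counting: bucket GPS pings into fixed-size
# cells and surface the densest cells as hotspots. Cell size is an
# illustrative assumption, not a production value.

from collections import Counter

def hotspot_cells(pings, cell=0.01, top=3):
    """pings: list of (lat, lon). Returns the `top` densest grid cells
    as ((lat_index, lon_index), count) pairs."""
    grid = Counter((round(lat / cell), round(lon / cell)) for lat, lon in pings)
    return grid.most_common(top)
```

Feeding in a cluster of pings near one city and a few near another, the densest cell reported is the first cluster's grid cell.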

Time continuum and visualization

Rivigo is building a time continuum of its key resources that will allow it to predict and create a performant, efficient logistics system. A time continuum is the analysis and visualization of everything that happens during the lifecycle of a resource; it is built by applying algorithms, intelligence and predictive models to huge quantities of time-series data. This needs scalable real-time and batch processing over big data.

Line haul planning

Line haul planning optimizes the plan based on historical demand, volumes and service-time commitments. The planning model determines the number of vehicles required on each route and across the network so that shipments can be routed in the most efficient way. The same planning can feed processing-center capacity planning and sales strategy to optimize the entire network. The problem is inherently a linear program with multiple objectives and requires very sophisticated approximations and heuristics to solve.
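The simplest sizing step inside such a plan can be written down directly: trucks per route from forecast volume, capacity and a utilization cap. The capacity and utilization numbers below are invented for illustration; the full problem layers routing, timing and cost constraints on top as an LP.

```python
# Illustrative sketch: size the fleet per route from forecast volume and
# truck capacity with a utilization cap. Capacity and cap values are
# assumptions; the real planning problem is a multi-objective LP.

import math

def trucks_per_route(volumes_kg, capacity_kg=20000, max_utilization=0.9):
    """volumes_kg: dict route -> forecast daily volume (kg).
    Returns dict route -> trucks needed so no truck exceeds the cap."""
    effective = capacity_kg * max_utilization     # usable payload per truck
    return {r: math.ceil(v / effective) for r, v in volumes_kg.items()}
```

For instance, 50 tonnes on a route with 18 tonnes of effective payload per truck needs 3 trucks.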

Tech platform

One of our over-arching goals is to bring 2 million trucks in India online in the next 3-4 years. We are building a high-quality tech and data platform to bring the entire trucking commerce (fuel, service, brokerage, resale, financing) online, ensuring higher efficiency, lower costs and data-led optimization for individual truckers. This is an immensely exciting project led by world-class engineers.

The future will be better if we waste less and use fewer and fewer resources for more and more output. Rivigo's core operating philosophy is based on this approach: through the use of data, we want to gain marginal efficiency and make the world of logistics as automated, efficient and safe as possible.

Please do reach out if you have common interests.

Introducing Rivigo Labs

Posted in Analytics, Data visualization, Disruption Opportunities, Internet of Things (IoT)

At Rivigo, data meets logistics and magic follows. We are transforming the antiquated logistics industry and bringing it into the 21st century with process automation, driver analytics and data science.

Rivigo is re-envisioning the truck as an Internet of Things (IoT) platform with intelligent sensors that constantly interact with a real-time, responsive logistics network. We use IoT to integrate communications, control and information processing across logistics networks, covering all elements: the vehicle, the infrastructure and the driver.


The charter of Rivigo Labs is to create the next generation of data acquisition, processing and visualization tools that will drive change in the logistics industry. Some of the problems we work on include network optimization, recommendation systems, end-to-end automation, human-factor design, smart trucking systems and beautiful visualizations, all at tremendous scale. We are not only pushing the envelope in the logistics industry, but also generating cutting-edge tools in IoT, data science and people analytics.

In a nutshell, we are building the next generation of transportation data science!

How Happy Birthday is said on greeting channels today

Posted in Analytics, Data visualization, Marketing & SoMe

I was researching digital marketing and social media concepts on my birthday, and I could not stop myself from doing some analysis of all the greetings I received. I plotted the 200+ greetings that arrived via various channels on "Personalization" and "Convenience" axes.


“Personalization” reflects how much personalization is possible via a given channel. It is not about you, but about the capability and common usage of the channel.

“Convenience” reflects how convenient it is for people to use a given channel to send greetings. I received a large number of greetings on Facebook, hence the convenience factor is high for this channel. It is clear that new social media channels provide a high degree of convenience and allow us to use special occasions to stay in frequent touch.

I should clarify: it is the channel that limits personalization, not the people. Over the phone, you will talk more; in a Facebook greeting, you will be short. That is how I do it: I write short messages on Facebook or WhatsApp for greetings and find it very convenient to wish my friends.

To all my friends, once again thank you for your wonderful wishes.

Overlay maps for superior visualization

Posted in Data visualization, Ideas

Overlaying custom or crowd-sourced information wherever a location-based decision has to be made is very interesting. I recently wrote an entry on the eRealtor market in India, and I feel real estate websites can make use of such maps in different ways. The possibilities are many and can help provide differentiation.

Where else can such a map be used? What other location-based searches do you do? Can this be used for hotel booking and travel planning websites where customers share their experiences? Imagine a map where you can clearly identify which locations are good to stay in and which are not. Now zoom out to the state or country level and you know which is a good place to make travel plans.

Entrepreneurship is #1 career aspiration today

Posted in Data visualization, Hiring, People & Culture

In April, I floated a survey on what you want to be in your career, and I received a great response within a week of sharing it. So much so that I had to purchase a basic plan on SurveyMonkey to analyze the results properly and quickly.

In the 7-question survey, overall results from the four central questions are shown below.

1. What is your career development/growth aspiration?

36% of respondents want to become entrepreneurs, followed by 35% who are looking for a new team or a promotion. This holds for both individual contributors and managers. This question allowed multiple choices, so the percentages will not add up to 100%.

What is surprising is that almost 60% of people are looking at new roles, new teams and new companies to meet their career aspirations. Relatively few (31%) think their current team can provide opportunities for career growth.


2. What actions can help you meet your career aspiration?

The top three choices here are talking to seniors and colleagues (66%), acquiring new technical skills (57%) and acquiring soft skills (55%). As expected, individual contributors care least about soft skills and most about technical skills, followed by talking to seniors and colleagues. Managers, however, care less about technical skills than about soft skills.

3. How frequently do you think about career development & growth?

91% of people actively think about career growth, while only 3% do not think about it. Those who think about career development frequently are looking at entrepreneurship and new teams: entrepreneurship is the top choice for managers, and a new team for individual contributors.

It would be interesting to correlate employee engagement with these career development results. A recent Gallup survey showed that 87% of employees worldwide are disengaged at work. I see a plausible link here: career growth issues may be responsible for some of the disengagement. But I did not ask a direct question about current engagement.

4. How often do you check the careers of your peers, colleagues & seniors to get ideas for your career path?

88% of people check the careers of others, and 63% do so very often. I think this also ties back to the 66% who talk to seniors and colleagues when planning their career growth.

I calculated the Pearson correlation coefficient, using the joint probability distribution, between how frequently someone thinks about career development and how often they check others' careers. The result of 0.57 shows a fairly good correlation between these two variables: someone who thinks frequently about their own career is often looking at others' career paths.
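For readers who want to reproduce this kind of number, the Pearson coefficient is straightforward to compute from paired responses; the ordinal coding of survey answers below is my assumption, not the survey's actual scale.

```python
# Standard Pearson correlation over paired samples. The survey-answer
# coding (e.g. 1 = rarely ... 3 = very often) is an illustrative assumption.

import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Perfectly aligned responses give +1, perfectly opposed ones give -1; a value like 0.57 sits in the moderately-correlated middle.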

If you need raw results, please leave a comment below.

Real estate performance of the housing market

Posted in Analytics, Data visualization, Ideas

The Economist published a study on house-price appreciation over the last few years. Data for most developed countries is available from 1975, but for India only from 2011 onward.

So, if you have invested in the Indian property market recently, or are interested in how your property has appreciated, here is a quick summary:

1. The nominal increase in house prices in India is 25% since 2011.

2. Not surprisingly, inflation has eaten away much of the gains, leaving just 2.4% price appreciation in real terms.

3. 2013 was a significantly bad year in India, with a 7% decrease over the first three quarters. Housing prices in the US, by contrast, actually grew 7% during the same period.

house price

This measurement is in real terms, meaning it accounts for the effect of inflation on purchasing power. City-wise drill-down is available for the USA but not for other countries. I guess India is not as thorough in collecting house-price data: important indicators such as price-to-rent (to gauge return on real estate investment) and price-to-income (to measure affordability) are missing.
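The nominal-versus-real arithmetic above can be checked directly: 25% nominal with 2.4% real appreciation implies roughly 22% cumulative inflation over the period.

```python
# Check of the nominal-to-real conversion used above. The implied
# inflation figure is derived from the article's 25% nominal and
# 2.4% real numbers, not taken from a separate source.

def real_growth(nominal, inflation):
    """Cumulative real growth from cumulative nominal growth and inflation."""
    return (1 + nominal) / (1 + inflation) - 1

# Inflation implied by 25% nominal and 2.4% real appreciation
implied_inflation = (1 + 0.25) / (1 + 0.024) - 1   # about 0.22
```

Plugging the implied inflation back in recovers the 2.4% real figure, confirming the numbers are mutually consistent.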

Here is the direct link to the charts.


Which has more lines of code – Windows or Mac OSX?

Posted in Data visualization, Technology

If you ever wondered how many lines of code are in your favorite app, check out the infographic by Information is Beautiful.

And yes, Mac OSX has more lines of code than any Windows OS! Do not miss the last application shown in the infographic.


One popular notion in the software industry is that “the more code you write, the more bugs you end up with”. What does this data say about the quality of these apps?

Some of you may wonder if these apps could be written with much denser code. There are many reasons why that may not be practical, primarily the need to keep applications extensible and maintainable. Another reason for bloated code can simply be poor programming.

However, we will never know. I don't think a tool or benchmark exists that can tell a project manager whether an application could be written in fewer lines of code. Even if it did, would you really care?

eBook and Big Six publishers

Posted in Book, Data visualization, Education

I recently completed an analysis of the Big Six publishers to understand the eBook market they represent. I decided to summarize the analysis in my first-ever infographic, made with Piktochart. I will write about the experience in detail later, but suffice it to say that infographic tools have a way to go before they are ready for wider adoption.

Big six publishers

Some interesting observations: while revenue seems to be increasing, profit is declining because of the larger share of eBooks. I used data from 2012, when Penguin and Random House were still separate entities. The data was collected from various public reports available on the internet.

What does the future hold for the bigwig publishers of the eBook industry, now the Big Five?

Facebook Page report – How much data can you really digest?

Posted in Data visualization, Marketing & SoMe

I was recently playing with the Facebook Page report and could not help noticing the endless number of sheets in the downloaded Excel file.

The Excel file contains 63 sheets, one of which has 75 columns of data. Most other sheets have 6-10 columns, which makes for roughly 500 data points available to analyze and compare. Ironically, the sheet with 75 columns is labelled “Key metrics”!

too much data

I tried thinking of scenarios where I would need information from all these data points. I think this is too much information, presented in a way that makes any analysis difficult. Can you really digest so much information? How do you deal with such large data sets? Ignore them? Build a composite metric out of them?

My topmost key metric is “People Talking About This” (PTT), which encapsulates the following user actions:

– Liking your page
– Posting to your page wall
– Liking, commenting on or sharing page posts
– Answering a question you posted
– Mentioning your page
– Tagging your page

What interesting data points have you found in the Facebook report? Should Facebook cut down on the irrelevant data points that no one ever cares about?