Capital Markets Surveillance using Big Data and Artificial Intelligence

Recently, the Securities and Exchange Board of India (SEBI) ordered RIL to give up the Rs 447 cr of gains it made on trades routed through a network of front entities.  It also directed RIL to pay additional penal interest of 12 per cent per annum from November 29, 2007.  RIL and the 12 front entities were also banned from accessing the equity derivatives market for a year.


Is this a case of Market Manipulation?   The regulators of Capital Markets, SEBI in the case of India, have been empowered to carry out regulatory functions, including Surveillance of Markets.  SEBI in turn also collaborates with Market Infrastructure providers, the Exchanges, and related players, to ensure effective Surveillance.

Purpose of Capital Market Surveillance

Below are some of the main aims of Capital Markets Surveillance:

  1. Create a level playing field for all market players: investors, hedgers, traders
  2. Prevent insider trading, market manipulation, and unfair pricing
  3. Help exchanges achieve regulatory compliance

You will notice that the trades in the RIL case took place in Nov 2007.  What could the regulator have done to catch this case earlier?  The advent of automated and algorithmic trading has also made the regulator's job more difficult because of the proliferation of data.  What can be done to avoid drowning in data?

This blog post showcases the use of Machine Learning to detect and flag such incidents, and to do so at a quicker pace.  This use case and solution were demonstrated at the NSE Fintech Hackathon 2018.

Factual Analysis of the case using Market Data

The analysis was done on the historical equities and futures data for RPL for the entire calendar year 2007.  In the graph below, the blue line shows the price of RPL (Reliance Petroleum) shares over that year.  Notice that the left-hand y-axis shows the price, ranging from Rs 70 to Rs 270.  The second line, in red, shows the daily futures turnover over the same period, with its range on the right-hand y-axis.


From the chart above, a clear pattern suggesting malicious intent can be detected.  In Oct-Nov, it appears that participants with inside information bought low and sold high.  These transactions were spread over two separate markets, viz., the Equities and Futures markets.

For the regulator, catching such malicious patterns takes a lot of manual work.  However, such patterns can be flagged with far less effort using Machine Learning algorithms.  Additionally, Machine Learning can analyze such patterns on an ongoing basis for all securities in the traded universe.

Spoiler Alert!

This section is technical in nature and assumes some Machine Learning know-how.  A general audience can skim or skip it.

Features of Machine Learning Used

The scenario described here is a supervised learning problem.  Neural networks are not a good fit, because they suit pattern detection where the underlying logic is unknown to humans or very complex, e.g., face recognition, voice recognition, etc.  In this case, the domain knowledge for detecting such patterns exists and we know which features to use, largely because the regulations themselves define the detection features.  A Support Vector Machine (SVM) with a Gaussian kernel is preferred because the decision boundary is non-linear, the number of features n is small, and the training set size m is medium.  Feature scaling must be performed before using the Gaussian kernel, since the kernel is based on distances between points and an unscaled feature with a large range would dominate.  Alternatives based on the logistic (sigmoid) function, such as logistic regression, are not well suited to this case.  Also, since training the SVM is a convex optimization problem, the optimizer will find the global minimum.  The C value of the SVM, which represents the penalty for misclassified training examples, can be tuned by training with different values.
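As a rough illustration of the approach described above, here is a minimal sketch using scikit-learn.  It is not the code used at the hackathon: the data is synthetic and made up, the two feature columns are placeholders, and the list of candidate C values is an assumption.  It scales the features, fits an SVM with a Gaussian (RBF) kernel, and picks C by cross-validation.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, made-up data: two features per stock-day, label 1 = abnormal.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[40.0, 60.0], scale=[10.0, 8.0], size=(120, 2))
abnormal = rng.normal(loc=[90.0, 20.0], scale=[4.0, 5.0], size=(30, 2))
X = np.vstack([normal, abnormal])
y = np.array([0] * len(normal) + [1] * len(abnormal))

# Scale the features first, then fit an SVM with a Gaussian (RBF) kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))

# Try several values of C (the penalty for misclassified training examples)
# and keep the one with the best cross-validated accuracy.
candidate_c = [0.1, 1, 10, 100, 1000]  # assumed search grid
search = GridSearchCV(model, {"svc__C": candidate_c}, cv=5)
search.fit(X, y)

best_c = search.best_params_["svc__C"]
# Classify two new, hypothetical observations with the tuned model.
flags = search.predict(np.array([[92.0, 18.0], [38.0, 62.0]]))
print("best C:", best_c, "flags:", flags)
```

Putting the scaler inside the pipeline means it is re-fitted on each cross-validation fold, so the held-out fold does not leak into the scaling statistics.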

The graphs below illustrate the classification of Normal and Abnormal cases using Machine Learning.  The pluses (+) represent flagged points.  The left-hand graph shows a linear kernel with C = 1.  Notice that the left-most plus (+) has been missed.  Increasing C to 1000, as shown in the right-hand graph, makes the boundary capture that point as well.
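The effect of C can be tried out with a small sketch like the one below.  The points are made up for illustration (they are not the data behind the graphs), with one flagged point deliberately placed close to the normal group.  A small C lets the optimizer trade a misclassified point for a wider margin, whereas a large C penalizes every miss heavily.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable points; label 1 = flagged (+), 0 = normal.
X = np.array([
    [1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.0, 2.5], [1.0, 3.0],  # normal
    [4.0, 3.0], [4.5, 4.0], [5.0, 3.5], [5.0, 5.0], [4.0, 4.5],  # flagged
    [2.6, 2.4],  # flagged point sitting close to the normal group
])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# Fit a linear-kernel SVM for each C and count the training points it misses.
missed_by_c = {}
for c in (1, 1000):
    clf = SVC(kernel="linear", C=c).fit(X, y)
    missed_by_c[c] = int((clf.predict(X) != y).sum())
    print(f"C={c}: misclassified training points = {missed_by_c[c]}")
```

A very large C can also overfit to outliers, so in practice the value is usually chosen on held-out data rather than on the training set.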


Applying the Gaussian kernel and its decision boundary to real data gives the actual analysis explained in the next section.  The model can be made more robust, and more features can be added if more data points are available.

Output/Intelligence through the Machine Learning Process

The graphs below show the classification of Normal and Abnormal cases using Machine Learning.  The following parameters are used:

  1. MWPL (Market-Wide Position Limit) daily data in % terms on the x-axis
  2. Rollover of monthly Futures and Options positions on the y-axis
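The two parameters above could be derived per security per day roughly as in the sketch below.  The definitions here are assumptions made for illustration, not the exchange's official formulas: MWPL utilisation is taken as total open interest as a percentage of the limit, and rollover as the share of open interest already sitting in later-expiry contracts.  The `DailyPosition` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DailyPosition:
    """Hypothetical end-of-day record for one security."""
    symbol: str
    open_interest_near: float   # open interest in the expiring (near-month) contracts
    open_interest_later: float  # open interest in later-expiry contracts
    mwpl: float                 # market-wide position limit for the security


def mwpl_utilisation_pct(p: DailyPosition) -> float:
    # Assumed definition: total open interest as a percentage of the MWPL.
    total = p.open_interest_near + p.open_interest_later
    return 100.0 * total / p.mwpl


def rollover_pct(p: DailyPosition) -> float:
    # Assumed definition: share of open interest held in later-expiry contracts.
    total = p.open_interest_near + p.open_interest_later
    return 100.0 * p.open_interest_later / total if total else 0.0


def feature_vector(p: DailyPosition) -> tuple:
    # (x-axis value, y-axis value) for the classifier.
    return (mwpl_utilisation_pct(p), rollover_pct(p))


sample = DailyPosition("XYZ", open_interest_near=6_000_000,
                       open_interest_later=2_000_000, mwpl=10_000_000)
print(feature_vector(sample))  # (80.0, 25.0)
```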


In the graph above, notice how some cases have been demarcated by a boundary and marked with a plus (+).  These flagged points are cases that the Exchange and Regulatory Compliance teams can review further and act on as appropriate.  This review can be part of end-of-day (EOD) processing.  With the huge number of transactions, EOD closure is taking longer.  This is where Machine Learning can help cut down detection time.


Artificial Intelligence, along with its sub-domains, including Machine Learning, has a critical role to play in Capital Market Surveillance.  The advent of Internet-based trading and the proliferation of algorithmic trading have led to a deluge of data.  The time has come for regulators and market players, including exchanges, to ask technologists for “help”.  Artificial Intelligence has been around for some time.  However, with the more widespread availability of:

  1. computational power,
  2. data storage and usage through Big Data systems for distributed data, including unstructured data,
  3. logic for formulating predictors/features/factors that have economic value in the short, medium, and long term (Machine Learning),

not utilizing these as part of business-as-usual processes will lead to competitive disadvantage.  Also, in growing economies, fair play will lead to fair markets and thus to widespread financial inclusion.  The time has come for regulators and exchanges to embrace Machine Learning in their day-to-day IT operations.