Karma Flows White Paper

24 August 2023

Karma–Start with Why

In this white paper we formulate and implement a mathematical representation of the social phenomenon of “Karma” and its flow through society on a global scale. As per our definition of Karma on the cover page, we formulate Karma as “the sum of someone’s actions; a characteristic emanation or aura that infuses someone or something.” This means capturing and aggregating actions or events and the energy these actions carry. Critically, this goes beyond merely capturing the tone or sentiment attributed to these actions. In this context, Karma is an allusion to the energy that flows through society, resulting from the actions of people, organisations and nation states. Karma then is “Sentiment with Impact.”

The core idea stems from algorithmic trading. Traders already use sentiment analysis, with more success than when trading on econometric factor models alone (Sentiment, n.d.).

The hypothesis is that the world of trading is as much a social phenomenon as it is a manifestation of quantitative econometrics. This is only natural in that, ultimately, all investors are humans or are represented by humans. Yet merely looking at the sentiments expressed in the news about a particular financial security may not be conclusive. Few isolated sentiments move the entire market. Most stocks, bonds and other securities move only partially in response to news concerning them directly and instead move in relation to market sentiment, as well as sector sentiment (for instance, the commodities sector or the banking sector). Variations in this collective sentiment might be deemed a manifestation of “social volatility.” Are sentiments then an expression of econometric models, or are econometric models an expression of sentiments? Or is there a circular dependency between these concepts? If the norms in a society or professional community give rise to mathematical models and their ubiquitous use, and the use of these models gives rise to sentiment in that society, then the overarching dynamic has moved from mere circular dependencies to one of reification.

The analogy evokes images of Penrose stairs and the manifestation of Strange Loops (Hofstadter, 1979).

Indeed, Emanuel Derman at Columbia University makes precisely this argument when he explains in his paper “A Stylized History of Quantitative Finance” (Emanuel Derman, 2022) how finance has progressed from trading securities to trading volatility as an asset class (reification) and how finance borrows models rooted in physics to model social behaviour in a manner that is neither absolutely true nor even accurate. He states that social standards in the world of finance have reified market models to produce unreliable financial theorems. If this is true and we believe it is, then attempting to model the social forces driving volatility is key to modelling financial markets as a whole.

If modelling markets based on social forces rings of “soft science” in comparison with the established mathematics of econometrics, it might help to recall that quantitative finance and econometrics have an extensive history of interdisciplinary borrowing. In his paper “The 7 Reasons Most Econometric Investments Fail” (Prado, 2019), the renowned financial mathematician Marcos Lopez de Prado, who testified before the U.S. Congress in 2019 on the impact of Artificial Intelligence on Capital Markets, explains how finance is rooted primarily in Biostatistics and Chemometrics and how the “Econometric Canon” (Econometrica, 1933) failed to bridge the gap between professional mathematicians and statisticians, whereas biology did bridge this gap (Biometrika, 1901). We conclude that what is oft regarded as “soft science” may exhibit more rigour than what is perceived as a profession rooted in hard science. This may be all the more true where that profession pays highly lucrative salaries and hires from the leading “hard sciences,” such as physics.

We return to contemplating what sentiment represents as a market mover. In order to move something with inertia, one needs force. How might we formulate this force? What is it? What comprises it? What is Karma? We are making an interdisciplinary leap. Our leap is between a social phenomenon and the laws of science: Under Thermodynamics, all energy build-up in a connected system results in the transfer (flow) of energy along the lines of connection. Is there any precedent for formulating a social phenomenon in such terms?

Alonso Pérez Pérez from the Mexican Center of Innovation in Ocean Energy speaks to this topic in his work entitled “The Social Energy: Contexts for Its Assessment” (Pérez, 2019). Pérez recalls how “Laplace established calculations of the probability that a quantity lay within certain limits” and how the French utopian philosopher Charles Fourier “took that principle—first used in the analysis of the voting procedures of some juries in Europe—and extended its application to mass social phenomena.” He points to Fourier’s commitment to a comprehensive theory about the way humans relate to each other. Pérez raises the question of whether “the notion of laws (such as Newton’s) could be found in societies.” To support his argument, Pérez points to the work of Professor Ian Hacking (Hacking, 1990), where the interpretation of probability (as a proposition about the stability of mass phenomena and the incorporation of the “law of errors” of observational astronomy) occurs in measuring population characteristics. Professor Hacking was the recipient of the first Killam Prize for the humanities in 2002, the Social Sciences and Humanities Research Council gold medal in 2008, Norway’s Holberg Prize in 2009 and, in 2014, the Balzan Prize from the Zurich-based Balzan Foundation, and was named a companion of the Order of Canada in 2004. We are encouraged to continue our quest.

Bootstrapping Karma

Let us recall our definition of Karma.

Karma: [noun] the sum of someone’s actions;
a characteristic emanation or aura
that infuses someone or something

In looking at this definition, we may contemplate some of the key elements.

Sum - We need the ability to aggregate.
Someone - This need not be a person. It may be a corporation or an organisation. Indeed, it may not be an “embodied” entity. Art has an influence upon culture. Culture has an influence on people. People do things. If we seek to capture this dynamic, entities need to include abstract concepts and themes. We will need an entire taxonomy that represents society.
Actions - These represent events: the things people do. People do things to others. This suggests we need to represent associations between entities. Entities and their associations are represented mathematically by graphs, the subject of Graph Theory. In today’s society, actions are recorded by news media and disseminated as news, chiefly on the internet. We will need the ability to take on board global news feeds or pre-processed databases of such news feeds.
Emanation or Aura - These may be represented by the tone or sentiment the actions evoke.
Infusion - Sentiments are ephemeral unless they make an impression, not merely on the recipient of an action, the outcome of the event, but on those registering the sentiment expressed by third parties. For example, an investor who puts their hard-earned dollars into a pharmaceutical stock in response to a euphoric article by a journalist detailing how the company behind the stock has just invented a cure for a previously incurable disease has been infused with the journalist’s sentiment. An equally euphoric article by a local sports writer about a minor league sporting event may prompt no action at all from the same reader. This suggests we need to gauge the gravity of events and actions and the “echo” which they create.
Characteristic - The overall purpose of an index is to remove characteristics and condense them into a unified measure. We note this concept because, if we later want to make predictions based on what is bound to be a large data set, we may wish to form lower-dimensional embeddings of Karma to enable interpreting not merely what has happened but what will happen, and perhaps what has happened already but was not reported.

The essence of Karma is “Sentiment with Impact.”

Formulating Karma

In the context of society, we regard Karma as the social energy flow that compels entities towards action. We observe similar dynamics at stock exchanges where order books collate price levels with bids and offers representing the sentiments of traders and which distil towards what is called the base weight for the instrument being traded. When price pressures build sufficiently in the direction of either bids or offers, prompting a shift in the base weight, then trade events occur. In physics (thermodynamics) all energy build-up in a connected system results in the transfer (flow) of energy along the lines of connection. In human society, sentiment represents emotional energy which precipitates action. In aggregate, that action becomes predictable. It is this we model as Karma and its flow through society.

We deliberated extensively on how to represent Karma formulaically. Our first, naïve, attempts at building a formulaic index centred around a product (multiplication), akin to electrical engineering, where power in watts is the product of pressure (voltage) and flow (current, in amperes). Our product was one of 1) event tone, 2) event severity/gravity and 3) connection strength between entities on a graph sharing the event. Factors had broadly normalised ranges in order to give comparable weights to each. Severity/gravity is measured on the Goldstein Scale (Goldstein, 1992).

While an increase of each input to the formula led to a proportional increase in our Karma value, the approach resulted in a multitude of issues with interpretability of the results.

Chiefly, in affixing a normalised scale to tone, multiplying other factors by a tone value smaller than one diminishes the result in that multiplication. The operation amounts to division. Should the scale then range from 1.0 to 100.0 or 1.0 to 10.0? If the latter, then all tone values in the lowest 10 percentile will diminish the other factors in the model. Without inherent valuations, there is no right answer. Whereas interpreting wattage as a product of only two factors is intuitive and broadly scales as a square, variations in our model scaled exponentially with the number of dimensions, producing sometimes very large variations in the output. 1x1x1 equals 1 whereas 10x10x10 equals 1000. A 10-fold increase in inputs across the board has resulted in a 1000-fold increase in outputs. Logarithmic scales can be similarly nonintuitive.

The approach we settled on leans on the way variance is combined in finance, namely as a Pythagorean sum (the square root of a sum of squares). This approach yields a much smoother signal, owing to better scaling of variance than the exponential scaling of the product-based approach.
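To make the scaling argument concrete, the following minimal sketch contrasts a three-factor product with a Pythagorean sum of the same three factors. It is an illustration only and not the final Karma formulation; the function names and the unweighted treatment of the factors are assumptions made for the example.

```python
import math


def product_index(tone, gravity, strength):
    """Naive product-based index: multiplies the three factors."""
    return tone * gravity * strength


def pythagorean_index(tone, gravity, strength):
    """Pythagorean sum: square root of the sum of squared factors."""
    return math.sqrt(tone ** 2 + gravity ** 2 + strength ** 2)


# Compare how each index reacts to a 10-fold increase in every input.
for scale in (1.0, 10.0):
    p = product_index(scale, scale, scale)
    q = pythagorean_index(scale, scale, scale)
    print(f"inputs={scale:5.1f}  product={p:8.1f}  pythagorean={q:6.2f}")
```

Under these assumptions, a 10-fold increase in all inputs raises the product 1000-fold, whereas the Pythagorean sum rises only 10-fold, which is the smoother behaviour described above.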

Our final formulation for Karma looks as stated below:

We deem the flow of Karma to occur between entities in their association. Sometimes we may wish to assess the Karma of an event, such as one occurring at a particular location, outside the context of its associations. For this, we define Event Karma according to the “Conflict and Mediation Event Observations” (CAMEO) system (Gerner, 2002).

Event Karma

Both formulations of Karma share the use of the Goldstein scale for severity/gravity and tone as sentiment but differ in their interpretations of strength or what we have previously referred to as infusion or “echo.” Karma Flow and Event Karma are therefore not directly comparable.

Sentiment Bias

The Karma Flows platform also identifies the sentiment bias of any data set for which Karma Flows undertakes an analysis. The news sphere has a proverb: “If it bleeds, it leads,” suggesting that negative news events generate more traction than positive news events. Karma should be interpreted with a view towards the overall bias of the data corpus and of similar corpora, whether for different regions or different time frames.

Figure 4 Sentiment Bias Formula

Karma as a Non-Directional Concept & Aura

Our present formulation of Karma is non-directional. Karma constructed from an event ‘E’ occurring between two entities ‘A’ and ‘B’ assigns neither credit nor blame to either. No judgement of right or wrong occurs in Karma Flows. Crucially, this means that a hero and a villain engaged in a singular struggle will be accorded the same Karma—the aura emanating from that struggle itself. For an entity to be accorded a positive Karma, it must collect positive associations. We call the sum of these associations Aura.

The Karma Flows System

Setting the Scene

We have introduced the subject of ‘Karma’ and ‘Karma Flows’ in the context of quantitative finance and trading, but our objectives are broader than finance. Karma Flows aims to retain its roots in finance yet find its place in the sphere of investigative platforms such as

Altamira Lumify - threat & security analysis investigative platform (Altamira, 2017)
Aleph - journalism analysis investigative platform (Systems, n.d.)
Open-Cyc & Cyc.com - Knowledge graph reasoning system (Cyc.com, n.d.)

Karma Flows aims to fuse the concepts embodied in the above platforms with modern machine learning techniques, in particular graph deep learning, and temporal graphs to invert the relationship between “Human Driven & Machine Assisted” to “Machine Guided & Human Evaluated.” Within this context, we aim to embed our concept of Karma.

Data Considerations

Data wrangling is the core of any data science project. Below are some data considerations central to Karma Flows.

As previously noted, we require a global news source. There are numerous sources for this type of information. If you work in finance, then chances are you have heard of the “Bloomberg Terminal” (Bloomberg, n.d.). Bloomberg’s rival firm Thomson Reuters provides a competing product in its Data Fusion (Reuters, n.d.) offering. What both offerings have in common is the amalgamation of modern machine learning techniques and graph theory. We will do likewise, integrating the concept of Karma and augmenting it, in particular, with graph deep learning.

Other sources of global news events include:

The Global Database of Events, Language and Tone (GDELT, n.d.)
Common Crawl News Crawl (Crawl, n.d.)
Google News RSS (News, n.d.)
Proquest Global News Stream (News Stream, n.d.)

The above is not an exhaustive list. Our imperatives are:

Global coverage
Intra-day resolution
Extensive historical data
Pre-processing (tone, knowledge graph, taxonomy)
Named entity recognition (NER)
Access to raw data to override pre-processing as needed

As in numerous projects, there is no right answer to the selection of a solution but a balancing of tensions. Data sets are optimised for specific tasks and, when put to a different purpose, they must be re-engineered and transformed. Data scientists typically seek to minimise that transformation.

The Karma Flows Framework

The Karma Flows framework comprises:

The GDELT knowledge graph and event database (GDELT, n.d.)
The Goldstein Scale for Event severity/gravity (Goldstein, 1992)
Conflict and Mediation Event Observations CAMEO system (Gerner, 2002)
World Bank Group Topical Taxonomy (World Bank, n.d.)
World Press Freedom Index (Frontieres, n.d.)
SEO Ranking for source validation (Moz.com, n.d.)

We require filtering capabilities to exclude fully censored news and propaganda, such as that emanating from contemporary dictatorships. All sources of information are ranked against the World Press Freedom Index. On this premise, the media of some otherwise large countries are discarded outright; data must contain information in order to be useful. Personal blogs may enjoy popularity, but will have enjoyed little or no editorial vetting. Are they authoritative? These considerations inform our approach to sampling what is a large data set. We have curated and labelled many thousands of themes according to economic relevance, sector relevance (e.g. commodities), as well as relation to the domains of military and government.
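The source filtering described above might be sketched as follows. This is a minimal illustration under stated assumptions: the `Source` record, the 0 to 100 score ranges, both thresholds and the example domains are hypothetical and are not taken from the Karma Flows implementation, the World Press Freedom Index or Moz.com.

```python
from dataclasses import dataclass


@dataclass
class Source:
    domain: str
    press_freedom_score: float  # hypothetical 0-100 scale, higher = freer press
    authority: float            # hypothetical 0-100 SEO-style authority score


# Hypothetical cut-offs chosen for illustration only.
MIN_PRESS_FREEDOM = 40.0
MIN_AUTHORITY = 30.0


def keep_source(source):
    """Keep a source only if it passes both the censorship and authority checks."""
    return (source.press_freedom_score >= MIN_PRESS_FREEDOM
            and source.authority >= MIN_AUTHORITY)


candidates = [
    Source("national-daily.example", press_freedom_score=78.0, authority=85.0),
    Source("state-mouthpiece.example", press_freedom_score=22.0, authority=90.0),
    Source("personal-blog.example", press_freedom_score=78.0, authority=12.0),
]

kept = [s for s in candidates if keep_source(s)]
print([s.domain for s in kept])
```

In this sketch, the censored outlet is discarded despite its reach, and the unvetted blog is discarded despite operating under a free press.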

The Karma Flows framework is shown below:

Karma Scope Computational Engine

Karma Flows is centred about its core computational engine called “Karma Scope.” This fulfils much the same purpose as an edge detection algorithm in visual image processing.

Figure 6 Edge Detection on Visual Image Processing (Detector, 2013)

Edge detection in visual image processing reveals features in an image at the expense of detail. We may regard this as dimensionality reduction. Only two shades of colour have remained in the above image. Intermediate shading is removed. Only shape remains. Karma Scope does the same for a temporal event graph connecting concepts and entities with Karma forming the dominant edges.

Karma Scope reduces something uninterpretable like this …

Figure 7 Global News Data Volume Visualised (GDELT, n.d.)

… to something interpretable like this:

Figure 8 Making Global News Data Interpretable through Karma Dominant Edge Detection

Karma Scope - GraphSAGE Attention Mechanism

Karma Scope can operate independently or be guided in its operation by GraphSAGE (William L. Hamilton, 2018). In guided mode, Karma Scope takes cues from GraphSAGE as to where to focus its attention. Karma Scope then “illuminates” areas of the graph identified by GraphSAGE link prediction. This mechanism allows unseen relationships to be identified and their context to be investigated. It is the core of the inversion from “Human Driven & Machine Assisted” to “Machine Guided & Human Evaluated” investigative capabilities in Karma Flows. It is not enabled in analysis reports posted publicly.

Karma Flows in Action

Interactive Ensemble Clustering for Graphs

Karma Flows introduces interactivity to Ensemble Clustering (Valérie Poulin, 2018) to assist with the discovery of latent communities. The concept is like clustering, where hidden similarities are used to identify related members of a community - so called “cliques.” Who is most closely associated with whom? And what themes unite each clique? Owing to Karma Scope’s regulariser, only significant relationships remain in this feature. Karma Scope has pruned insignificant relationships from the graph. We may think of this like “tip of the iceberg” or just the prominent ridges on a topographical map.
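The pruning idea can be illustrated with a simple magnitude threshold over edge Karma values. This is a sketch under assumptions: a plain threshold stands in for Karma Scope’s regulariser, whose actual form is not described here, and the edge values and cut-off are hypothetical.

```python
def prune_edges(edges, min_abs_karma):
    """Keep only edges whose Karma magnitude reaches the threshold.

    `edges` maps an (entity, entity) pair to a signed Karma value.
    """
    return {pair: karma for pair, karma in edges.items()
            if abs(karma) >= min_abs_karma}


# Hypothetical edge Karma values, for illustration only.
edges = {
    ("Georgetown University", "EDUCATION"): 6.5,
    ("Russia", "Ukraine"): -9.0,
    ("Jane", "Bob"): 0.4,
    ("Alice", "Jane"): -0.2,
}

dominant = prune_edges(edges, min_abs_karma=1.0)
for (a, b), karma in dominant.items():
    colour = "blue" if karma > 0 else "red"
    print(f"{a} -- {b}: {karma:+.1f} ({colour})")
```

Strongly positive and strongly negative edges both survive; only the faint associations are removed, leaving the “prominent ridges” of the graph.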

Shown below is a sample analysis for the United States for the month of May 2023.

Figure 9 Karma Flows Ensemble Clustering

Node colours & clusters show latent community membership. Blue edges denote positive Karma. Red edges denote negative Karma. Intermediate values range in shades between red and blue. Click on nodes to reveal ranking order within latent communities. Click on edges to reveal associated news artefacts and Goldstein ratings. Wider edges denote stronger connections. Nodes can represent people, places, organisations as well as themes and taxonomy concepts. Themes and concepts are shown in brackets.

We may zoom in and select a cluster of like colours - here associated with the theme “education.” Edges are shown in blue suggesting positive Karma values.

Selecting one node within this cluster, we obtain the relevant clique and its leaders - refer to “clique_ranking” overleaf and on the right. Members are shown in the order of influence they hold within the data set. Artefacts allow a “drill down” on events and news stories which contributed to the association between Georgetown University and the theme of education. Note that titles may be misleading in that an apparently positive news title says little about associations between actors and entities within the article. Likewise, a positive association alone does not make a direct statement about the associated entities, only the association.

Selecting the theme which binds all associated universities allows a comparative analysis of their Karma during the period. At each stage, we may click the buttons to save our analysis to disk.

Figure 12 Comparative Karma Analysis for Entities Bound by Common Theme

While the Karma for this part of the graph is blue, most edges for the period appear in red, suggesting a dominantly negative outlook for the month. Karma Flows explicitly quantifies this trend in each analysis as “News Sentiment Bias.” Shown below are the analysis headers for the United Kingdom, Canada, Australia and the United States for the same period.

Table 1 Comparative Sentiment Bias for Different Countries

As suggested, the dominant bias in our sample analysis is negative. It is easy to see why. Among the largest entity labels in the space of red associations looms Russia and its conflict with Ukraine. Label size corresponds to the degree of connectedness of an entity.

Figure 13 Exploring a Relationship Between Entities

Let us explore this relationship.

Figure 14 Exploring an Entity and its Associations

Having selected the node labelled “Russia,” we may immediately observe the following themes and concepts: crisis and safety, crisis and death, leadership and politics. This is unsurprising. In looking at the clique and clique rankings, we note the themes of MILITARY and ARMED_CONFLICT. There is a tight association between Russia, Moscow and Kiev, which is also unsurprising. The next association is with Beijing, China and the Kremlin - yet we observe that Beijing, China does not appear to have jumped out. It is subtly nestled among the other entities. When selecting China explicitly, however, a different relationship emerges.

Figure 15 Exploring a Nexus between Entities

The Karma flow between Russia and the United States has China as its nexus which, judging by the width of the Karma edge, is far stronger than the direct flow of Karma between the United States and Russia. Indeed, respective connections from China to the United States and Russia to China have identical Karma classifications, as well as other comparable graph edge attributes. Being able to elicit this kind of nexus is a key strength of Karma Flows.

We might submit the news artefacts associated with this relationship to a separate Karma Flows analysis to investigate this question further. For this purpose, Karma Flows provides a separate Neighbourhood Cloud component, which constructs a Karma Scope instance using a neighbourhood cloud along a path of connectedness that we wish to investigate. The idea is straightforward: the association of any two entities is best explained by the association of their relations. This step is assisted by deep learning graph link prediction.

Interpreting Edge Artefacts

As noted, graph edges in Karma Flows represent aggregate associations between entities. When interpreting the artefacts behind this aggregation, it is important to understand the manner of operation of Karma Scope. Each edge documents its artefacts in the right panel of the Karma Flows view, as shown overleaf.

In the above case, the titles appear intuitive and related to the entities under consideration: Ukraine and the theme MILITARY. Sometimes, the URLs and their titles will be nonintuitive for a context.

Consider the following fictitious scenario: Bob, a pharmaceutical scientist, gives a presentation about a newly invented medicine for a previously incurable disease. Among the attendees are two people: 1) Alice, a fellow scientist and 2) Jane, a notable member of the local community. Alice avidly asks questions during the presentation, whereas Jane sits quietly in the rear of the audience. An article written by a journalist mentions all three persons. The article is accorded a positive tone. Because Alice asks repeated questions of Bob, the article notes their close association. Natural Language Processing (NLP) registers the strong association between Bob and Alice as well as the tenuous associations between Jane and Bob and between Jane and Alice. All three are tagged by Named Entity Recognition (NER) algorithms. Three associations are created: Bob–Alice, Alice–Jane, Jane–Bob. All three associations “inherit” the positive aura of the event. Realistically, there is little information regarding Jane’s involvement. Jane is merely present at the event. We have data but no information. Without further data about Jane, we cannot say much about her. A common rule of thumb, following from the central limit theorem, is that once a sample reaches roughly 25 to 30 observations, its aggregate statistics behave approximately according to a Normal (Gaussian) distribution. It is at this point that we see valid patterns about Jane. Her associations are aggregated and perhaps Jane attends all presentations in her town. Soon, an association between Jane and her town emerges in the graph. But now the artefacts appear less congruent than in the very clear example of Ukraine. Instead of titles about Jane and her town among the URL artefacts, we will see pharmaceutical articles and information about other domains. Nothing has gone wrong if we understand how the associations have been derived.
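The way associations arise in the Bob, Alice and Jane scenario can be sketched as a pairwise co-occurrence over the entities tagged in one article. This is a minimal illustration, assuming each pair simply inherits the article tone; the function names and the tone value of 4.0 are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def associations_from_article(entities, tone):
    """Create one association per unordered pair of entities in an article.

    Every pair inherits the article's tone, however tenuous the link.
    """
    return [(a, b, tone) for a, b in combinations(sorted(entities), 2)]


def aggregate(associations):
    """Return (observation count, mean tone) for each pair of entities."""
    tones = defaultdict(list)
    for a, b, tone in associations:
        tones[(a, b)].append(tone)
    return {pair: (len(values), mean(values)) for pair, values in tones.items()}


article = associations_from_article({"Bob", "Alice", "Jane"}, tone=4.0)
summary = aggregate(article)
for pair, (count, avg_tone) in sorted(summary.items()):
    print(pair, count, avg_tone)
```

After a single article, each of the three pairs has exactly one observation with the same tone, which is why little can yet be said about Jane. Patterns become meaningful only once the observation count per pair grows across many articles.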

Exploring the Interconnectedness of Economic Drivers

Shown below are the key entities connected to economic performance and their respective Karmas for the period of June 2023.

Figure 17 Exploring Key Economic Drivers

News Data Graph Deep Learning Prediction Diagnostic Capability

A key question then is how much “predictive signal” may be found in public news data, and whether any prior research has attempted to quantify this for the domain of graph deep learning. Shown below is the diagnostic capability of our graph deep learning link predictor for both training and validation data sets as a function of training epochs. As with the Receiver Operating Characteristic (ROC) (ROC, n.d.), we may use this capability to gauge the accuracy of our predictor.

Shown below is an example training history on Australian news data for May 2023.

Figure 18 TensorFlow Graph Deep Learning Accuracy on Public News Data

It surprised us to note the powerful performance of deep learning on public news data relative to the number of training epochs. Broadly, we may expect the above model to have a true positive rate of 75%-80%. When gauging individual predictions, we will additionally need to consider the probability assigned by the predictor. Hence, if GraphSAGE assigns a 95% probability to an individual prediction on our graph and the model is 80% accurate, then the likelihood of that individual prediction being true is 0.8 * 0.95, hence 76%.
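The arithmetic above can be written out as a one-line calculation. The sketch follows the simple multiplication used in the text, which treats the model accuracy and the per-prediction probability as independent; the function name is an assumption for illustration.

```python
def combined_likelihood(model_accuracy, predicted_probability):
    """Multiply overall model accuracy by the probability of one prediction."""
    return model_accuracy * predicted_probability


likelihood = combined_likelihood(0.80, 0.95)
print(f"{likelihood:.0%}")
```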

No hyperparameter “tweaking” was performed, which suggests a stronger underlying signal in public news data than we had dared to hope and that additional gains may be attained.

Karma Flows - Centrality Measures

Internally, Karma Flows works with several measures of centrality, such as degree, PageRank (Rank, n.d.), harmonic and betweenness. Of these, the degree of connectedness is most readily interpretable for members of the public. Degree is a count of connections for an entity. For each element, we are given an internet look-up to Wolfram Alpha (Alpha, n.d.), Google & Wikipedia. Internet look-ups are a convenient way to disambiguate named entities.
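Degree, being a count of connections, is the simplest of these measures to compute. A minimal sketch follows; the edge list is hypothetical and chosen for illustration only.

```python
from collections import Counter


def degree_centrality(edges):
    """Count the number of connections for each entity in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical associations, for illustration only.
edges = [
    ("Russia", "Ukraine"),
    ("Russia", "China"),
    ("China", "United States"),
    ("Russia", "MILITARY"),
]

degree = degree_centrality(edges)
for entity, count in degree.most_common():
    print(entity, count)
```

In this example Russia has the highest degree (three connections), so it would carry the largest label in the graph view.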

[Table 2 Karma Measures of Centrality]

If It Shines, It Leads

Positive Sentiment Priority View

In 2021, the Washington Post ran an article on Facebook’s anger emoji. The social media giant was alleged to have given five points for anger and one for a ‘like.’

Figure 19 Negative Sentiment Bias in Social Media (Post, 2021)

There is nothing novel about news negativity except that social media has put the phenomenon into our pockets by way of “doomscrolling.” News media has always lived by the adage that “If it bleeds, it leads.” Negativity bias is an evolutionary survival mechanism. This suggests that Karma Flows’ ratings are to be interpreted in the light of historical sentiment bias.

Being cognisant of the built-in negativity bias in news media, Karma Flows offers the view of “If it shines, it leads.” In this view, positive Karma edges are rendered on top, leaving the negative edges beneath. Karma edges are spread out rather than clustered according to their respective communities or ‘cliques.’ The data is the same, but its presentation favours good Karma. In statistics, there is rarely an opportunity to present matters in a good light with full transparency, in a good spirit, and without manipulating the data.

This is one such occasion.

Figure 20 Positive Sentiment Priority View

We may immediately ask: what do people feel best about? We move our mouse pointer over the nodes on the dominant blue edges. We note the themes of education and university. Indeed, nearly every blue line ends on some university. Amongst all the bad news, tertiary education in the U.S. would seem to be ‘carrying the day.’

View	Legend

[Table 3 Positive Sentiment Dominant Theme Analysis]

Karma Geolocation

Not everything that makes headlines in the United States, or any region, pertains to something within that region, just as Ukraine is not in the United States. We, therefore, might wish to have a geolocation view of our data. Karma Flows provides this through its Karma Geolocation view.

As before, we can zoom in and examine contributing news artefacts. Red suggests negative sentiment, blue suggests positive sentiment.

This provides a simple insight into ‘what happened where?’ But we note again that the news artefact URL is not conclusive. The last line in the pop-up suggests the event related to the provision of economic aid and a Supreme Court decision, while the URL suggests an initial publication date prior to the analysis period.

Figure 22 Exploring Karma Flows Geolocation Events

Aura Reporting

Aura emerges around any entity in Karma Flows when many independent connections emerge around that entity. Hence, it is statistically relevant and valid only for entities with a high degree of connectedness within the graph. Aura is simply the average of Karma Flows about an entity.

The reader will recall that we stated that no value judgement of events nor attribution of right or wrong occurs in Karma Flows. When all associations with an entity across a broad spectrum are negative, we may then infer that the Aura about this entity is negative also, at least in the localised context of the analysis and for the isolated time frame of said analysis.

We observe an entity’s aura simply by selecting it in Karma Flows and zooming in. Shown below is a comparative analysis of country aura for New Zealand, the United Kingdom, China, Russia and the United States, drawn from Australian data for May 2023. What is informative is the colour spectrum of the Karma Flows connections drawn to each country. A blue-dominated spectrum denotes a more positive aura. A red-dominated spectrum denotes a more negative aura.

[Table 4 Country Comparative Aura]

Karma Flows aggregates the Aura associated with dominant themes and concepts in each analysis. Shown below are the dominant themes for the Karma Flows analysis for the United States in the month of June 2023, along with the Aura attributed to these themes.

What news themes were important to Americans in June 2023?

Table 5 Comparative Theme Aura & Compensated Aura

Karma Flows calculates a “Reference Aura” which denotes the average of all auras in the analysis and hence the negativity bias built into the news mediascape at large. Compensated Aura then is the individual aura as it stands out from the Reference Aura. We might also call this the relative aura. How does this entity stand out from its peers in the analysis?
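The relationship between Aura, Reference Aura and Compensated Aura can be sketched as follows. This is a minimal illustration under assumptions: “standing out from” the Reference Aura is taken to mean a simple difference, and the themes and Karma values are hypothetical.

```python
from statistics import mean


def aura(karma_values):
    """Aura of an entity: the average of the Karma on its connections."""
    return mean(karma_values)


def compensated_auras(auras):
    """Return the Reference Aura and each aura's offset from it.

    Assumption: Compensated Aura = individual aura - Reference Aura.
    """
    reference = mean(auras.values())
    return reference, {name: value - reference for name, value in auras.items()}


# Hypothetical Karma values per theme, for illustration only.
karma_by_theme = {
    "education": [2.0, 4.0],
    "conflict": [-6.0, -4.0],
    "economy": [-1.0, -3.0],
}

auras = {theme: aura(values) for theme, values in karma_by_theme.items()}
reference, compensated = compensated_auras(auras)
print(f"reference aura: {reference:+.2f}")
for theme, value in compensated.items():
    print(f"{theme}: aura {auras[theme]:+.2f}, compensated {value:+.2f}")
```

Here the Reference Aura is negative, reflecting the overall bias of the corpus. A theme with a mildly negative aura can therefore still show a compensated aura close to zero, meaning it is no worse than its peers.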

Kernel Density Estimation and Karma Prediction

Karma Flows computes a Kernel Density Estimate (KDE) of the Probability Density Function (PDF) of the Aura of individual entities. This gives a shape and profile to each Aura and permits their comparison. In the diagram below, we may observe that the profiles for the United States and China show more resemblance than, for instance, the profiles for the United States and Ukraine.

Figure 23 Kernel Density Estimation of Entity Aura
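To make the idea of comparing Aura profiles concrete, the following is a minimal sketch of a one-dimensional Gaussian KDE together with an overlap measure between two estimated densities. The Gaussian kernel, the normal-reference bandwidth rule, the overlap coefficient and the sample values are all our assumptions for illustration; the paper does not state which kernel, bandwidth or similarity measure Karma Flows uses.

```python
import math
from statistics import stdev


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples`."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def overlap(pdf_a, pdf_b, lo, hi, steps=400):
    """Integral of min(pdf_a, pdf_b) over [lo, hi] (midpoint rule).

    A value near 1.0 means near-identical profiles; near 0.0 means disjoint.
    """
    dx = (hi - lo) / steps
    mids = (lo + (i + 0.5) * dx for i in range(steps))
    return sum(min(pdf_a(x), pdf_b(x)) for x in mids) * dx


# Hypothetical aura samples for three entities (illustrative only).
entity_a = [-3.0, -2.5, -2.0, -1.5, -1.0, 0.5]
entity_b = [-2.8, -2.2, -1.9, -1.2, -0.8, 0.4]
entity_c = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0]

pdf_a, pdf_b, pdf_c = (gaussian_kde(s) for s in (entity_a, entity_b, entity_c))
overlap_ab = overlap(pdf_a, pdf_b, -10.0, 10.0)
overlap_ac = overlap(pdf_a, pdf_c, -10.0, 10.0)
print(f"overlap A/B: {overlap_ab:.3f}")
print(f"overlap A/C: {overlap_ac:.3f}")
```

Entities whose samples cluster around similar values yield a higher overlap, which is one way of quantifying the visual resemblance of two profiles.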

More significantly, given two entities and their Auras, we may use the Kernel Density Estimation method to predict the Karma for a hitherto unknown association between two entities by forming their joint KDE.

This allows us to answer the question of what the most likely Karma value would be for an association between any two entities, should such an association be predicted.
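One possible reading of “forming the joint KDE” can be sketched as follows. We assume, purely for illustration, that the two entity densities are treated as independent, that their product is evaluated on a grid, and that the grid point with the highest product is taken as the most likely Karma value. The fixed bandwidth, the grid bounds and the sample values are hypothetical, and the method Karma Flows actually uses may differ.

```python
import math


def kde(samples, bandwidth):
    """Gaussian kernel density estimate with a fixed, assumed bandwidth."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return lambda x: norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


def most_likely_karma(samples_a, samples_b, bandwidth=0.5,
                      lo=-10.0, hi=10.0, steps=2000):
    """Grid point in [lo, hi] that maximises the product of both densities."""
    pdf_a = kde(samples_a, bandwidth)
    pdf_b = kde(samples_b, bandwidth)
    grid = [lo + i * (hi - lo) / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: pdf_a(x) * pdf_b(x))


# Hypothetical Karma samples observed for two entities (illustrative only).
entity_a = [-3.0, -2.0, -1.0]
entity_b = [-1.0, 0.0, 1.0]

predicted = most_likely_karma(entity_a, entity_b)
print(f"most likely Karma for the unseen association: {predicted:+.2f}")
```

The predicted value falls where both profiles carry appreciable density, that is, between the centres of the two entities’ samples.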

Minority & Majority Link Prediction

In a graph, “Link Prediction” means predicting the edges of the graph or the properties of those edges. Traditional graph methods excel at computing measures of centrality and identifying latent groups or clusters within a graph. This enables us to identify leading influencers and gauge the associations between entities.

Graph Deep Learning goes one step further by learning lower-dimensional embeddings of the high-dimensional data comprising the graph. This allows the prediction of associations between entities which are hitherto unrecorded, much as machine learning over a time series allows the prediction of future values in that series. In a graph based on event data, Graph Deep Learning affords us the ability to ask: “What has either happened already but gone unreported, or what might be happening soon?” Since our graph models the aggregated gravity and associated sentiment of events, what we term Karma, Graph Deep Learning allows us to ask: “Where will something significant happen?” We will not necessarily know precisely what event(s) will happen, but we can form a view as to their gravity and associated sentiment. We may also be able to infer the type of thing(s) about to happen (their theme). This, in turn, may help to focus more exacting forms of analytics, which might otherwise be prohibitive to perform in an unfocused manner across a large domain. We use GraphSAGE to implement deep learning over the Karma Flows graph.
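For readers unfamiliar with GraphSAGE, the following is a minimal sketch of a single layer using the mean aggregator described by Hamilton et al.: each node’s features are concatenated with the mean of its neighbours’ features, passed through a linear map and a ReLU, and then normalised. The toy graph, the feature vectors and the fixed weight matrix are hypothetical; in practice the weights are learned. The link decoder (a sigmoid over the dot product of two embeddings) is a common choice that we assume here, not a detail confirmed by the paper.

```python
import math


def sage_mean_layer(features, neighbours, weights):
    """One GraphSAGE layer: mean aggregator, ReLU, then L2 normalisation."""
    out = {}
    for node, h_self in features.items():
        nbrs = neighbours.get(node, [])
        dim = len(h_self)
        # Mean of the neighbour features (zero vector for an isolated node).
        h_nbr = [
            sum(features[u][k] for u in nbrs) / len(nbrs) if nbrs else 0.0
            for k in range(dim)
        ]
        concat = h_self + h_nbr
        z = [max(0.0, sum(w * x for w, x in zip(row, concat))) for row in weights]
        norm = math.sqrt(sum(v * v for v in z)) or 1.0
        out[node] = [v / norm for v in z]
    return out


def link_probability(embeddings, u, v):
    """Assumed decoder: sigmoid of the dot product of two node embeddings."""
    score = sum(a * b for a, b in zip(embeddings[u], embeddings[v]))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical toy graph: A and C are both linked to B, but not to each other.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
# Fixed illustrative weights (2 outputs x 4 inputs); real weights are trained.
weights = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]

embeddings = sage_mean_layer(features, neighbours, weights)
p_ac = link_probability(embeddings, "A", "C")
print(f"score for the unrecorded association A-C: {p_ac:.3f}")
```

Because the layer draws on each node’s neighbourhood, two entities that are not directly connected can still receive similar embeddings, which is what makes a score for an unrecorded association possible.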

Karma Flows then produces two reports: ‘Minority’ and ‘Majority’.

Minority Predictions detail associations which GraphSAGE deems ought not to be in the graph. In essence, it disagrees with the original graph having formed this association. Whether fake news or a contradiction of a dominating trend, GraphSAGE deems the associations in the Minority Predictions to be false.

Majority Predictions detail associations which are not found in the graph, but which GraphSAGE deems ought to be there. The interpretation is that these are associations which represent events that have either gone unreported or which are about to happen. GraphSAGE deems these associations to be true.
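The split into the two reports can be sketched as below. We assume that the model yields a probability for each candidate association and that a fixed threshold separates “true” from “false”; the threshold of 0.5, the function name and the entity names are hypothetical and serve only to illustrate the logic described above.

```python
def build_reports(observed_edges, predicted_probability, threshold=0.5):
    """Split model predictions into Minority and Majority reports.

    Minority: associations present in the graph that the model deems false.
    Majority: associations absent from the graph that the model deems true.
    """
    observed = {frozenset(edge) for edge in observed_edges}
    minority, majority = [], []
    for edge, probability in predicted_probability.items():
        in_graph = frozenset(edge) in observed
        if in_graph and probability < threshold:
            minority.append((edge, probability))
        elif not in_graph and probability >= threshold:
            majority.append((edge, probability))
    # Most confident disagreements first in each report.
    minority.sort(key=lambda item: item[1])
    majority.sort(key=lambda item: item[1], reverse=True)
    return minority, majority


# Hypothetical graph edges and model scores (illustrative only).
observed_edges = [("Entity A", "Entity B"), ("Entity B", "Entity C")]
predicted_probability = {
    ("Entity A", "Entity B"): 0.91,  # in graph, model agrees
    ("Entity B", "Entity C"): 0.12,  # in graph, model disagrees
    ("Entity A", "Entity C"): 0.84,  # not in graph, model expects it
    ("Entity A", "Entity D"): 0.07,  # not in graph, model agrees
}

minority, majority = build_reports(observed_edges, predicted_probability)
print("Minority:", minority)
print("Majority:", majority)
```

Associations on which the graph and the model agree appear in neither report; only the disagreements are surfaced.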

The naming of these reports is a thinly veiled allusion to the 2002 science fiction film ‘Minority Report’, directed by Steven Spielberg and featuring Tom Cruise, in which a team of “pre-cognitive” oracles was able to see crimes before they happened. According to the movie’s plot, a “pre-crime” division in the police force then apprehended suspects before they were able to commit the offence. A minority report constituted a disagreement among the “pre-cognitive” oracles. We thought the analogy apt in that, in our implementation of Karma Flows, the gravity and sentiment of events are predicted where a better-informed model (GraphSAGE) disagrees with the traditional, less informed graph model.

Triangulating Karma Flows, Google News & Large Language Models

Modern science revolves around putting forward hypotheses and then testing them. Indeed, this might be regarded as the essence of the scientific process. Yet when a tool takes measurements of something which has no parallels, it becomes challenging to distinguish between what one thinks one sees and what one sees—especially true when one has made an interdisciplinary leap. A good friend and mentor of mine pointed this out to me.

In quantitative finance we have the concept of back tests, an algorithmic trader’s trusty old friend. Ours is a more complex situation. To verify Karma Flows by measuring whether a predicted set of events with a given gravity and sentiment has eventuated, we would need Karma Flows itself, which is the tool we seek to test. This is circular. But we may make an attempt at “Triangulating” what is being reported. Shown below is an excerpt from the majority report of the Karma Flows analysis for the United States for the month of June 2023. Note that the list is sorted by probability, not Karma.

Table 6 Karma Flows Majority Predictions

We will take each in turn. Setting the scene, the above predictions are obtained from a graph with data for June 2023. The June data did NOT contain these associations, but GraphSAGE predicted them. At the time of writing, it is the end of July 2023. What we might try to ascertain is whether anything may readily be found to have occurred, as per the top 10 search results (front page) in Google News for the entities in question. Google News is a reasonable safeguard against confirmation bias here in that the top 10 search results will contain matches, but not necessarily for the time window we desire. Search results from June 2023 do not count, nor do those on page two of the Google search results.
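The acceptance rule just described (front page only, and only results dated after the training window) can be written down as a small filter. This is a sketch under stated assumptions: in this paper the matching was done by reading the results, whereas the code below substitutes a naive title match; the function name, the result structure and the sample headlines are hypothetical.

```python
from datetime import date


def triangulating_hits(results, entities, window_start, window_end, top_n=10):
    """Front-page results inside the window whose title names every entity."""
    hits = []
    for rank, item in enumerate(results[:top_n], start=1):
        in_window = window_start <= item["published"] <= window_end
        title = item["title"].lower()
        if in_window and all(entity.lower() in title for entity in entities):
            hits.append((rank, item["title"]))
    return hits


# Hypothetical search results, in the order returned by the search engine.
results = [
    {"title": "Entity X and Entity Y sign accord", "published": date(2023, 6, 12)},
    {"title": "Weather outlook for the weekend", "published": date(2023, 7, 3)},
    {"title": "Entity X recalls envoy from Entity Y", "published": date(2023, 7, 19)},
]

hits = triangulating_hits(
    results,
    entities=["Entity X", "Entity Y"],
    window_start=date(2023, 7, 1),
    window_end=date(2023, 7, 28),
)
print(hits)
```

The first result names both entities but dates from June and is therefore rejected; only the third result counts towards triangulation.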

1) As of 28 July 2023 - Notable arrests on the Ivory Coast in July

Figure 25 Prediction Triangulation - Ivory Coast

We note that the first two results do not match, but the third result details the arrest of a national from the Ivory Coast. It is not immediately clear why this incident might be reported in a national tabloid except that social media incidents, such as the one detailed, attract popular attention. At this point, we have triangulated the first prediction.

2) As of 28 July 2023 - Williams Institute. Is it “educational?” What happened in July?

According to Google, the Williams Institute is a gender issues policy research institute within the UCLA School of Law. This corroborates the label we see in Karma Flows. Tick. The institute is an education department at UCLA. Performing a search on Google News for “Williams Institute”, we learn that 4 of the 10 results on the front page focus on a report from the Williams Institute released in July.

https://williamsinstitute.law.ucla.edu/publications/transpop-substance-use/

We have thus triangulated the second prediction.

Predictions 3 and 4 are particularly interesting as they are obviously related.

Our first and broadest triangulation uses Large Language Models (LLMs). Does the association make “common sense?” We use a tailored in-house Large Language Model which a) has been instruction tuned, unlike ChatGPT, which is tuned for conversations, and b) has been optimised to be less creative. The latter means that the model’s freedom to speculate has been diminished. We also utilise a corpus that includes a contemporary data set. At the time of writing, ChatGPT has been trained with data up to 2021. We need more recent data.

Shown below is the output from our in-house LLM on the subject of predictions 3 and 4.

Figure 26 Prediction Triangulation - Large Language Models

Interestingly, the model gives a detailed evaluation from both a historical perspective as well as pertaining to contemporary affairs. We conclude that the predictions rendered by GraphSAGE based on our formulation of Karma do appear to agree with the historical context as well as the contemporary context.

3) & 4) As of 28 July 2023 - “Vostok, Tatarstan, Russia and Ukraine” on Google

The very first Google result appears as this:

Figure 27 Prediction Triangulation - CriticalThreats.org

CriticalThreats.org reports as follows:

https://www.criticalthreats.org/analysis/russian-offensive-campaign-assessment-july-1-2023

Figure 28 Prediction Triangulation - Russian Offensive Campaign Assessment

Considering the recognition status of the “Donetsk People’s Republic,” the classification of mercenary appears appropriate. Crucially, we triangulated all four predictions within our established parameters.

Finally, we refer to the section “News Data Graph Deep Learning Prediction Diagnostic Capability” to show that our model has been able to converge and elicit information from the data based on our formulation of Karma.

Closing the Loop to Econometrics

Our journey started with econometrics and trading, and it shall conclude there. How might we use Karma Flows to place informed trades in the market? Successful trading relies on exploiting inefficiencies in the market. Moreover, trading strategies are judged on their risk-adjusted returns, not their absolute returns. The more we know, the better we can manage risk; this is the chief aim of gauging volatility. Several entries in the World Bank Group Topical Taxonomy, among others, concern themselves with related themes (Economic Policy Uncertainty). One strategy is to build an index on their aggregate Karma. Such an index will be whole-of-market. What if we wanted to study specific sectors, or the connection of an organisation, for instance a corporation, to these themes? What if there is no direct association? For this, we employ maximum flow analysis. Maximum flow is defined as the maximum amount of flow that a network allows from one point to another.

Much like a road network with junctions and intersections will allow a certain number of vehicles to travel along both arterial roads and local roads along a variety of paths between a point of departure and a destination, so will a graph allow a maximum flow between two nodes of the graph, provided the graph has been defined with distances and capacity in mind. Therefore, we formulate our Karma graph in terms of connection strengths (distance) and sentiment gravity (capacity). This allows us to gauge how closely and directly any one entity is connected to certain themes and concepts (integral maximal flow). More detail about our implementation of this feature is available upon request.
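As an illustration of the maximum flow concept, the following is a minimal sketch of the Edmonds-Karp algorithm applied to a small directed graph whose edge capacities stand in for sentiment gravity. It is not our implementation: the graph, the entity names and the capacity values are hypothetical, the edges are treated as directed for simplicity, and connection strength (distance) is not modelled here.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a directed graph {u: {v: capacity}}."""
    if source == sink:
        raise ValueError("source and sink must differ")
    # Residual graph; reverse edges start with zero capacity.
    residual = {source: {}, sink: {}}
    for u, targets in capacity.items():
        for v, c in targets.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for the shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical Karma graph: capacities stand in for sentiment gravity.
karma_graph = {
    "Corporation": {"Supplier": 3, "Bank": 2},
    "Supplier": {"Theme": 2, "Bank": 1},
    "Bank": {"Theme": 3},
}

flow = max_flow(karma_graph, "Corporation", "Theme")
print(f"maximum flow from Corporation to Theme: {flow}")
```

Although the corporation has no direct edge to the theme in this toy graph, the flow through its intermediaries gives a measure of how strongly the two are connected.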

Summary

This paper has introduced the concept of the flow of Karma as a form of societal energy comprising sentiment, event gravity and connection strength between both concrete and abstract entities. Our chief motivation has been to model volatility in financial markets, but we aim further afield into the realm of investigative reporting and threat analysis. We supported our motivation with arguments by Emanuel Derman at Columbia University, who holds that existing financial standards are inadequate as they have reified market models to produce unreliable financial theorems, and with arguments by Marcos Lopez de Prado at Cornell, who posits that the “Econometric Canon” has failed to bridge the gap between professional mathematicians and statisticians. We then established precedent for our interdisciplinary leap of formulating social phenomena in terms of the laws of physical science based on the work by Alonso Pérez Pérez from the Mexican Center of Innovation in Ocean Energy.

We then presented our technology framework based on graph theory, graph deep learning and Large Language Models. Central to this framework is the fusion of our core computational engine ‘Karma Scope’ and the ‘Attention Mechanism’ afforded by GraphSAGE deep learning. With the aid of this framework, we can elicit dominant relationships in news data, determine leading influencers, uncover latent groups between entities in society, as well as predict relationships which reflect events that have either gone unreported or are likely to occur soon. We are also able to compute the negativity bias in news, as well as compute the connectedness of entities to economic themes for the purposes of algorithmic trading.

Finally, we triangulated our results with the help of Google News and Large Language Models to evaluate the efficacy of predictions based on our formulation of Karma as social energy flows.

The Chronicles

It is our intention to release monthly backdated Karma Flow analyses of a selection of countries and regions on our website. Stay tuned for their release!

Karma Flows Chronicles

More frequent, daily, or intraday and bespoke analysis will be available upon request.

Please contact chris@karmaflows.com

Patents

The technology described in this paper is patent pending in the United States of America, the United Kingdom and Australia as well as internationally through the Patent Cooperation Treaty (PCT).

Works Cited

Alpha, W. (n.d.). https://www.wolframalpha.com

Altamira. (2017, August 29). Altamira - Lumify. https://www.bloomberg.com/press-releases/2017-08-29/altamira-lumify-now-available-on-microsoft-azure

Biometrika. (1901). Biometrika. Oxford University Press: https://academic.oup.com/biome

Bloomberg. (n.d.). https://www.bloomberg.com/professional/solution/bloomberg-terminal/

Crawl, C. (n.d.). https://commoncrawl.org/2016/10/news-dataset-available/

Cyc.com. (n.d.). Logic-Based Machine Reasoning. https://cyc.com

Detector, C. E. (2013, May 05). https://imagej.nih.gov/ij/plugins/canny/index.html

Econometrica. (1933). The Econometric Society. https://www.econometricsociety.org/publications/econometrica

Emanuel Derman, C. U. (2022, December 9). A Stylized History of Volatility. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4297590

Frontieres, R. s. (n.d.). https://rsf.org/en/index

GDELT. (n.d.). A Global Database of Society. https://www.gdeltproject.org

Gerner, D. J. (2002, April). Conflict and Mediation Event Observations (CAMEO): A New Event Data Framework for the Analysis of Foreign Policy Interactions. https://www.researchgate.net/publication/2840364_Conflict_and_Mediation_Event_Observations_CAMEO_A_New_Event_Data_Framework_for_the_Analysis_of_Foreign_Policy_Interactions

Goldstein, J. S. (1992, June). A Conflict-Cooperation Scale for WEIS Events Data. The Journal of Conflict Resolution: https://www.jstor.org/stable/174480

Hacking, I. (1990). The Taming of Chance. USA: Cambridge University Press.

Hofstadter, D. (1979). Gödel, Escher, Bach: An Eternal Golden Braid. United States: Basic Books.

Moz.com. (n.d.). SEO Ranking. https://moz.com

News, G. (n.d.). https://news.google.com/news/rss

News Stream, G. (n.d.). https://about.proquest.com/en/products-services/globalnewsstream/

Pérez, A. P. (2019, May 28). The Social Energy: Contexts for Its Assessment. https://www.intechopen.com/chapters/71102

Post, W. (2021, October 26). https://www.washingtonpost.com/technology/2021/10/26/facebook-angry-emoji-algorithm/

Prado, M. L. (2019, April 16). The 7 Reasons Most Econometric Investments Fail. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3373116

Rank, G. P. (n.d.). Page Rank. https://towardsdatascience.com/pagerank-algorithm-fully-explained-dc794184b4af

Reuters, T. (n.d.). Data Fusion. https://developerportal.thomsonreuters.com/forums/data-fusion

ROC, R. O. (n.d.). https://online.stat.psu.edu/stat504/lesson/7/7.4

Sentiment, B. (n.d.). Embedded Value in Bloomberg News and Social Sentiment Data. https://www.bloomberg.com/professional/sentiment-analysis-white-papers/

Systems, A. (n.d.). Data Project by data platform by Organized Crime and Corruption Reporting Project: https://docs.aleph.occrp.org

Valérie Poulin, F. T. (2018, September 14). Ensemble Clustering for Graphs. https://arxiv.org/abs/1809.05578

William L. Hamilton, R. Y. (2018, September 10). Inductive Representation Learning on Large Graphs. https://arxiv.org/abs/1706.02216

World Bank, G. T. (n.d.). http://vocabulary.worldbank.org/taxonomy.html