There are two principal ways graph settings underpin various aspects of statistical Artificial Intelligence, which is typified by advanced machine learning deployments.
First, they’re well suited to determining which data are relevant for building these cognitive computing models. Second, they’re ideal for a number of machine learning techniques (such as Graph Neural Networks and clustering) that work best when deployed in graph environments.
However, when utilizing knowledge graphs for these connectionist AI approaches—termed “graph AI” by Katana Graph CTO Chris Rossbach—it’s important to realize that not all graph platforms are equal in their ability to implement these two advantages.
“If you look at the evolution of graph databases over the past 15 or so years, I think it’s fair to say that most of them began as systems that supported some form of path query, some form of transactional access that was customized for graph,” Rossbach observed. “Most of them eventually evolved support for analytics routines and subsequent support for things that are starting to look a little more like graph AI.”
There’s a big difference between opting for one of these solutions that began as transactional in nature and one that was expressly created for the sort of analytics that characterizes graph AI. A brief retrospective of the history of these graph frameworks illustrates this vital distinction in graph technology for maximizing the usefulness of AI in graph settings.
The Emergence of Commercial Graphs
Graph technologies have been in use for quite some time, dating at least as far back as the 20th century. However, their present popularity, which is demonstrated most often in the contemporary moniker ‘knowledge graph’, is rooted in developments occurring in the first decade of the present millennium.
“People have been writing and researching graph algorithms and learning how to do path queries for many, many years,” Rossbach recounted. “Graph query is also called path query. But it was not until Neo4j came out that these ideas were sort of embodied in a commercial system and labeled a graph database.” That vendor, and most of the others that populated the graph landscape in the ensuing years, specialized in transactional processing.
According to Rossbach, this approach works well for “OLTP use cases where there’s an engine that’s optimized for lots of small, update heavy database queries that are focused on this graph model.” However, it becomes less effective for enterprise-scale analytics of the kind that directly supports modern graph AI.
Graph Analytics Systems
There’s a world of difference between transactional graph platforms and those expressly designed for analytics, particularly when it comes to graph AI. The latter are ideal for a number of graph AI workloads, including graph query, graph mining, and graph analytics. “While most of our competitors are really graph databases trying to become graph analytics systems, Katana was originally designed as a graph analytics system and supporting graph query is a journey we embarked on when we got started as a startup,” Rossbach remarked.
A single analytics-primed platform for issuing queries and mining graph data is useful for exploring training data for models, generating machine learning features, and enhancing the inherent signal in those features to make them stronger. “The way you design a scale out graph analytics system is, I would say, fundamentally different from how you design a system if you’re starting from this assumption that what we really want is transactions on graph,” Rossbach reflected. For example, the former invokes High-Performance Computing methods to rapidly perform the above functions for graph AI at a speed and scale that’s impossible without this methodology.
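As a rough illustration of what "generating machine learning features" from a graph can mean, the sketch below derives in-degree and out-degree for each node from a plain edge list. It is a minimal example in standard-library Python, not Katana Graph's API; the function name `degree_features` and the account identifiers are hypothetical, chosen only for illustration.

```python
def degree_features(edges):
    """Build simple per-node features (in/out degree) from a directed edge list.

    Hypothetical helper for illustration; not part of any graph product's API.
    """
    features = {}
    for src, dst in edges:
        # Make sure both endpoints have a feature record before counting.
        features.setdefault(src, {"out_degree": 0, "in_degree": 0})
        features.setdefault(dst, {"out_degree": 0, "in_degree": 0})
        features[src]["out_degree"] += 1
        features[dst]["in_degree"] += 1
    return features


# Toy, made-up data: each pair is (sender, receiver) of a transfer.
transfers = [
    ("acct1", "acct2"),
    ("acct1", "acct3"),
    ("acct2", "acct3"),
    ("acct4", "acct1"),
]

feats = degree_features(transfers)
for node in sorted(feats):
    print(node, feats[node])
```

Even this trivial feature requires a pass over every edge, which hints at why engines built for whole-graph scans may be a better fit for feature generation than engines tuned for small transactional updates.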
Graph AI Today
There are countless use cases for the sort of graph AI one can do via the approach Rossbach articulated. Louvain clustering, for instance, has been gaining traction in the fintech space. The design of transactional graph systems “is very, very difficult to extend to do things like Louvain clustering or other classic analytics algorithms like DFS, or breadth-first search, or betweenness centrality, because these are things that iteratively touch the entire graph over and over and over again,” Rossbach pointed out. “So, you need a very different kind of engine.”
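To make the point about algorithms that "touch the entire graph" concrete, here is a minimal breadth-first search sketch in standard-library Python. It is an illustration under simple assumptions (a small in-memory adjacency list, hypothetical names such as `bfs_levels`), not how any particular graph engine implements traversal. The counter shows that a single traversal scans every adjacency entry reachable from the source, whereas a transactional-style lookup reads only one vertex's neighbor list.

```python
from collections import deque


def bfs_levels(adj, source):
    """Breadth-first search: hop distance from source to each reachable vertex.

    Also counts adjacency entries scanned, to show how much of the graph
    one traversal touches. Hypothetical helper for illustration only.
    """
    dist = {source: 0}
    queue = deque([source])
    edges_scanned = 0
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            edges_scanned += 1
            if v not in dist:  # first visit fixes the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist, edges_scanned


# Toy undirected graph stored as an adjacency list (each edge listed twice).
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

distances, scanned = bfs_levels(graph, "a")
total_entries = sum(len(neighbors) for neighbors in graph.values())

print(distances)               # hop counts from "a"
print(scanned, total_entries)  # BFS scanned every adjacency entry
print(len(graph["d"]))         # a point lookup reads one neighbor list
```

Algorithms such as Louvain clustering or betweenness centrality repeat whole-graph passes like this many times, which is the access pattern Rossbach contrasts with small, update-heavy transactional queries.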
Engines within native settings for graph analytics can fortify each of these use cases that Rossbach mentioned, which either involve or directly support graph AI by determining suitable training data for models and generating their features. Such engines also support several other types of analytics routines offering these same boons. Consequently, learning whether or not a graph platform has these underlying capabilities is a vital consideration for those looking to avail themselves of the current hype around knowledge graphs and their merit for statistical AI approaches.
Jelani Harper is an editorial consultant servicing the information technology market. He specializes in data-driven applications focused on semantic technologies, data governance, and analytics.