The Interconnectedness of Things

What Role Do Knowledge Graphs Play in AI?

August 27, 2024 QFlow Systems, LLC

In this episode of "The Interconnectedness of Things," QFlow Systems’ COO, Dr. Andrew Hutson, and VP of Engineering, Greg Romano, delve into the transformative power of AI and Large Language Models (LLMs) in data management.

They discuss how businesses can leverage AI to automatically organize and structure their data, creating a robust knowledge graph that captures and refines corporate knowledge over time. The conversation highlights the importance of maintaining data privacy while enabling customized AI-driven insights for individual businesses. The episode also touches on the limitations of generative AI in predicting future outcomes and underscores the need for more advanced, isolated LLMs that can benefit organizations without compromising security.

Dr. Hutson and Greg offer insights into how QFlow Systems is pioneering this approach, providing customers with tools to better manage their knowledge assets in a way that is both innovative and secure.

About "The Interconnectedness of Things"
Welcome to "The Interconnectedness of Things," where hosts Dr. Andrew Hutson and Emily Nava explore the ever-evolving landscape of technology, innovation, and how these forces shape our world. Each episode dives deep into the critical topics of enterprise solutions, AI, document management, and more, offering insights and practical advice for businesses and tech enthusiasts alike.

Brought to you by QFlow Systems
QFlow helps manage your documents in a secure and organized way. It works with your existing software to make it easy for you to find all your documents in one place. Discover how QFlow can transform your organization at qflow.com

Follow Us!
Andrew Hutson - LinkedIn
Emily Nava - LinkedIn

Intro and Outro music provided by Marser.

Welcome back to The Interconnectedness of Things, the podcast where we explore the latest in AI technology and how it intersects with business and everyday life. I'm your host, Emily Nava. And today, we have an insightful conversation between QFlow Systems' COO, Dr. Andrew Hutson, and our VP of Engineering, Greg Romano. This conversation was recorded during a real-life meeting, where Greg entered our virtual office with Andrew and me. Andrew and Greg started talking about large language models, diving deep into how those can transform the way businesses manage their data, creating knowledge graphs that capture and refine information over time.

So let's jump right in and see what they have to say. What did you say about knowledge graph dataset? Oh, well, so I'm hypothesizing that you can take in a corpus of existing unstructured data, or a bunch of documents that you've had. But the difficulty a lot of folks will experience is creating their own labels and fields. How do I want the system to describe this repeatedly?

And if you take an LLM and put it against that dataset, you should then be able, if you limit the confidence interval to something very, very high, very likely correlated, to create that initial set of labels and properties that's bespoke to that dataset, which could remove the burden of "how do I get organized?" and shift it to "it's already organized; how do I make it better?" Mhmm. Greg, can you translate? Maybe.

Let's see. So what I'm hearing is that if we can run some LLM magic against this thing, we're going to be able to get the terms, the glossary, if you will, of a business. So... That's exactly right. We would use the world LLM that has the 405 billion parameters, as far as Llama 3.1 is concerned. And it would allow us to then say, hey.

Don't give me a likelihood. Give me a high likelihood. An absolute, if you will. Yeah. As close to certainty as you can reasonably provide.

And if we talk about the Gaussian distribution of a, you know, 5 or 6 sigma level of confidence that these terms accurately describe not only the documents in and of themselves, but also the likely relationships between these documents, then we could reinforce, or make it easier to begin, the knowledge graph within our product based off of a known cohort of data. That's a lot of smart terminology you're throwing at me there, Hutson. Let's see if I can put that into my own speak. Just my bachelor's degree speak from a state university. Using the magic of AI and large language models, we're thinking we could have a customer provide us with their data, perhaps in documents or maybe even spreadsheets.
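To make that idea concrete, here is a rough sketch in Python of a high-confidence labeling pass of the kind being described. The prompt wording, the ask_llm() placeholder, and the 0.99 floor are illustrative assumptions, not QFlow's implementation; the point is simply that suggestions below the threshold never reach the user.

```python
import json

CONFIDENCE_FLOOR = 0.99  # "as close to certainty as you can reasonably provide"

PROMPT = """Read the document below and propose labels (object types) and
properties that describe it. For each suggestion, include a confidence
between 0 and 1. Answer as a JSON list of objects shaped like
{{"label": ..., "properties": [...], "confidence": ...}}.

Document:
{text}
"""

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to a locally hosted model (for example, a
    Llama 3.1 deployment). Wire this to whatever client you actually use."""
    raise NotImplementedError

def propose_labels(document_text: str, floor: float = CONFIDENCE_FLOOR) -> list[dict]:
    """Ask the model for candidate labels and properties, then keep only the
    suggestions it is nearly certain about, so the customer sees a small,
    likely-correct starting vocabulary instead of a noisy guess list."""
    raw = ask_llm(PROMPT.format(text=document_text))
    suggestions = json.loads(raw)  # assumes the model answered with valid JSON
    return [s for s in suggestions if s.get("confidence", 0.0) >= floor]
```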

But is that... Yeah. Any data that they would want to upload. So just think like your first-time customer. Let's take Google Drive. K?

So I'm in Google Drive. I got a bunch of documents. I wanna make sure they're backed up to the cloud. Mhmm. Cool.

I'm just gonna throw them up into the cloud. So we give the same treatment to our customers. They can throw it up into our product. Mhmm. The difference is when I'm in Google, it doesn't do anything intelligent after that.

Or, if it does do anything intelligent, we don't get the benefit from it, because Google's doing it on the back end to sell us ads. What we could do is... And that's hyper inventory. Well, that's... yeah. That's another... I get that. I understand that.

Very good. But what we could do is take that data. We could have a localized LLM sitting in the back end and give the option to the end user to opt in to have us create your first, we use the term "objects," your first object types and properties. So it could search through. It could find those.

It could structure them. We would set the confidence interval very high. We wouldn't give them the option. Mhmm. So that way, we would have a higher likelihood that, if we presented something to them, it would likely match, and then they could opt in and say, yes.

Take that and apply it to those documents in my workspace. And then we allow them to modify and edit it over time. Potentially, there's even a... okay, now I'm really going off script here, but, like, there is an option that this can happen over time.

It's not just a one-time event: as the corpus of documents grows, so too do our suggestions on node labels and node properties. Mhmm. Are you thinking that improvement is happening for that existing customer over time, and we're going to, well, a term I would use would be refactor, but that's probably more of a, you know, developer speak, but more of a rearrange, rename a data model or terminology that they know, because now we know more about their data? Or are you thinking this is more of improving it for the next customer and the next customer and the next customer?

No. For me, I think that we have to be really careful around data privacy, because... Yeah. In the new world, while a couple years ago I would have said yes to your second proposition of, well, let all of our customers benefit from this learning. Mhmm. At this stage, I don't think we can get into the game where we're gonna go provide meaningful parameters to reinforce a current LLM.

That's gonna get faster, better. We just need to leverage it. Yeah. Nor do I really wanna be in the game of, you know, leaking customer data or getting it out there into the world. Oh, absolutely not.

I would prefer not to be on the... That's not the way we wanna be in the news. Right? No. I don't wanna be on the splash page of every major news article out there. Absolutely not.

At least not for that. Right. Of the reasons we would, this is not one of those. Perhaps The Interconnectedness of Things, and it's the hot new podcast that's trending. Yeah.

There you go. Okay. It's trending. Maybe that's where they wanna be. Yeah.

To further that hypothesis, though: if we isolate it to each workspace that our customers would interface with, it would be reinforcing or making suggestions based off of the use of that workspace. Mhmm. And we could introduce different or updated LLMs that would be localized and keep the data private, but that would be the extent to which we would interfere. Yeah. Now separately, if we want to do our own work where we would suggest beginning templates based off a business type, I think that's what people do today.

If we look at other products that are out there, they don't have this reinforced approach to the LLMs yet, and instead, they spend a lot of time and money creating taxonomies and structures to make it an easy button for their customer. But, invariably, if you go create, say, a taxonomy based off of loans and borrowers, while there might be some overlap across industry, likely you'll have different processes and have to modify it anyway. So it would be better to start from a position of what is it that you actually have and what are you doing. And then there is a potential there to say, do we offer benchmarks or baselines or practices that others are using, if they wanna share that with the community? And then that's where you can start to get that sharing of data, that socialization that happens with knowledge management.

That's so critical that maybe that's the community that chooses to do that rather than we choose it on their behalf. Yeah. Because I know we had talked quite a bit about building these sorts of verticals or these templates. So, if you were a first-time customer of QFlow, we would provide a list of some prebuilt data models, what I like to call data models, for your organization. So you could go in and say, oh, yeah.

I'm health care, or I'm a doctor's office or an attorney, or I'm doing case management. Mhmm. Mhmm. And then we would baseline you, which would, you know, hopefully give you some terms and names of things that would be recognizable to you. Right?

But, boy... I still think that has its place. It does. But, boy, if we could leverage some compute to do that on our behalf, that sounds... I'm intrigued. I'm in. Yeah.

That... How do I get one of these? Well, I mean, that's what we're bringing. That's what we're working on. One, we fundamentally believe that every business needs a knowledge graph, period. Full stop. Yeah.

We also know that if you're in the knowledge business, and more and more companies are, even if they think they're not; think about, you know, construction. There's plenty of knowledge that's gained in a construction company that gets lost as soon as people retire. Right? Yeah. Every single business has some amount of knowledge and some amount of corporate amnesia.

So they would benefit from a knowledge graph to capture, retain, build, and improve the connections to what's learned and how to operate. And so we fundamentally believe everybody needs it. The problem is people don't know what it is, and it could be too esoteric to build themselves. Yeah. This has got me thinking that maybe I could build a knowledge graph of my own.

And when my kids go off to school, I could just hand it to them, and then they could just ask it: what would dad think? Mhmm. I mean, I'm getting ready to go out tonight, and this is what I'm wearing. What would dad say?

And they could just take a picture of it and plug it into the app, and it would say, you know, you look nice tonight. Yeah. I think it could get cool. Now you get that emotional feeling too. Right?

And then we might get down to, like, uploaded intelligence at that point, or all the way to, like, Superman in 1978 with, you know, Jor-El coming out and talking to him and getting him educated. Like, that's nice to think about. It is. Yeah. That's absolutely an awesome future.

I think, for the next decade, though, before we're able to get that uploaded intelligence feasible... Right. ...it's allowing companies to create these knowledge graphs alongside LLMs that are isolated from non-corporate actors. So that would be anybody outside of your company, so that you could benefit from the parameters and the models and the compute that was given over a decade or so... Mhmm. ...without having to incur that cost yourself, but you get the benefits from it. And now that more and more of those are going open source, generative AI is gonna peak pretty quick, because people are gonna see more and more that generative AI is a probabilistic model that goes through several transformers to guess what the next right word part is.

It's not very good at predictive intelligence. And I know that might be counterintuitive, but that's why generative AI is gonna peak, is the story, because people still wanna be able to predict what will happen. Yeah. I read a Medium article recently that was speaking about the same thing: these AI engines are not really good at predicting the future, because they don't have that human reasoning about them.

I would love to pull up that article and reread what that was. Well, there's a reason for it. Right? It's because it's based off of a probabilistic approach. Right.

And these transformers... so, literally, GPT stands for generative pre-trained transformer. Now you might think that means it's predicting the future. It's not. It's guessing, based off of a corpus of previous data, what is likely the next word part to produce for you. Mhmm.
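As a toy illustration of that "smart guessing" (an editorial sketch, not anything from the episode), next-word generation reduces to looking up what usually follows the last word and sampling from it. A real LLM estimates these probabilities with billions of parameters, but the step is the same, which is why fluent text is not the same thing as foresight.

```python
import random

# Toy next-token table: given the last word, how likely is each continuation?
NEXT_TOKEN_PROBS = {
    "the": {"market": 0.4, "weather": 0.35, "future": 0.25},
    "market": {"will": 0.6, "closed": 0.4},
    "will": {"probably": 0.7, "definitely": 0.3},
}

def generate(start: str, steps: int = 3, seed: int = 0) -> str:
    """Sample a continuation one word at a time. Each step only asks
    'what usually comes next?', not 'what will actually happen?'."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(steps):
        dist = NEXT_TOKEN_PROBS.get(words[-1])
        if not dist:
            break
        tokens, weights = zip(*dist.items())
        words.append(rng.choices(tokens, weights=weights, k=1)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the market will probably"
```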

Well, if you think about that, that's not intelligence. That's smart guessing. And when we talk about... Yeah. ...the future, and we wanna try to predict what will happen and when, Gen AI isn't a good candidate for that type of work. Instead, there's actually been years of work on predicting and preventing issues with data.

And so if I stop saying "AI" for a moment, I'll give you a story. My brother is a reliability engineer. Now he's an executive, and he goes around, and he's kinda boring. But when he was doing cool stuff, he was going around and placing sensors on manufacturing lines so that he could use a model to determine when they should schedule maintenance to prevent the line from going down due to some faulty part. And he could do this by putting the sensors in about 200 different locations and determining the vibration of the manufacturing line.

And if the vibration wasn't right, he could determine which part was causing that, so that he could schedule the maintenance at a desirable time versus in the middle of a big push to actually get something through the line itself. And he was explaining to me how that saves millions of dollars a day, being able to predict when is the right time to schedule maintenance. Coolest stuff ever. That was 10 years ago. And so more and more of that is coming into play that has nothing to do with, do I get to talk to a chatbot and does it sound like Scarlett Johansson?
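That style of predictive maintenance doesn't need a chatbot at all; it is closer to the sketch below, where each sensor location has a learned baseline and readings that drift too far from it get flagged for scheduled service. The sensor names, baseline numbers, and three-sigma rule are invented for illustration.

```python
from statistics import mean

def flag_for_maintenance(readings: dict[str, list[float]],
                         baseline: dict[str, tuple[float, float]],
                         sigma: float = 3.0) -> list[str]:
    """Return sensor locations whose average vibration drifts more than
    `sigma` standard deviations from its healthy baseline, i.e. the spots
    worth servicing at a convenient time instead of mid-production."""
    flagged = []
    for location, values in readings.items():
        mu, sd = baseline[location]
        if sd > 0 and abs(mean(values) - mu) > sigma * sd:
            flagged.append(location)
    return flagged

# Baseline (mean, std dev) per sensor location, learned from healthy runs.
baseline = {"bearing_12": (0.42, 0.03), "motor_7": (0.55, 0.05)}
latest = {"bearing_12": [0.41, 0.43, 0.42], "motor_7": [0.81, 0.79, 0.83]}
print(flag_for_maintenance(latest, baseline))  # ['motor_7']
```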

Yeah. It got me to thinking about a podcast that I'm listening to, and this is kind of a dark side to take it to. But this is where software and AI go wrong. It has to do with the Boeing MAX 8. Oh, yeah.

And how they were talking about, you know, how that software that was built to make sure that the plane's pitch wasn't too high or too low kicked in and did the wrong thing when one sensor failed. It's taking in all this information from all these spots, from all these different sensors. Mhmm. All this data is coming into this software, and then, with that one failure, it had this critical failure. And I've only listened to the first episode of a multi-episode podcast, so I don't know all the details.

But it was... are we doing a podcast plug here? Well, I'm not gonna name it, because... Well, yeah. They gotta pay for that. We're not... Yeah. I don't know why we were going to give away something like that.

We're already giving away this AI engine that's gonna go out and build a company's, you know, taxonomy for them. So... Yeah. We gotta be careful on how much I give away here. Well, that's right. They listen to this and then get their hands out.

You know? Those models that come in, like, with the Boeing MAX, that's, seriously, the danger of too much reliance on a model without enough experience to counter it. Yeah. And I think that's what we've kind of been experiencing at least the last 5 years across multiple industries. This whole AI thing has been coming in, but we're really starting to see the impact of the poor decisions that it produced.

Right. Even the craziest one I heard about: in San Francisco on the news this morning, residents were complaining about noise issues, and it was all of these driverless cars that were honking at each other. I did see that. And, apparently, they were honking at each other because if a driverless vehicle is backing up in reverse, it will stop if another car is coming towards it. But then that same kind of driverless vehicle, now a second one, is coming towards it. So the driverless car was backing up, and then the other driverless car was driving towards the car backing up.

And so now they both stop, and then they start trying to creep forward and then keep stopping. And, eventually, one of them starts honking at the other, but they're both stopped. So now extrapolate this out not to 2 of these driverless vehicles, but a whole fleet going into a parking garage and honking at each other at 3 in the morning. Outside an apartment complex? Yes.

Wild. Yeah. That's crazy. My mind immediately goes to, how would I solve that problem in software? That's ridiculous.

How would you solve that problem? Yeah. I don't have any idea how I would solve that problem. I'd probably start over. Yeah.

Yeah. Well, how do humans solve it? I would... what now? How do we as humans... we've been in that same situation. How do we solve it?

You roll your window down and you throw your arm out. As an option. Yep. Yeah. So you would communicate, or you would back off.

Yeah. Right? So if someone's coming in, they're gonna cut you off. You're like, you know, okay. Fine.

I'm just gonna grit my teeth and bear it, or I'm gonna flip you the bird, or I'm gonna choose to run into you because I got more insurance than you. Right? Whatever decision is made, it all has to do with the outcome that is desired by each party and then some type of communication that happens. So the fastest way to solve it is if all these autonomous vehicles were communicating, connected, and aware of all the other autonomous vehicles, so that they would know if that situation happens again. They're like, oh, well, if you're reversing, you stop, and the one that was driving, go on by.
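As a purely illustrative toy of that rule (real autonomous-vehicle stacks do not work this way, and, as noted just below, the cars apparently don't talk to each other at all), the coordination amounts to two vehicles exchanging intent and applying "the reversing car holds, the forward-moving car proceeds":

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    vehicle_id: str
    reversing: bool

def resolve_standoff(a: Vehicle, b: Vehicle) -> dict[str, str]:
    """Toy version of the rule in the conversation: if the two cars can
    talk to each other, the reversing one holds and the forward-moving
    one goes by; if both are doing the same thing, break the tie by id
    instead of honking at 3 in the morning."""
    if a.reversing and not b.reversing:
        return {a.vehicle_id: "hold", b.vehicle_id: "proceed"}
    if b.reversing and not a.reversing:
        return {b.vehicle_id: "hold", a.vehicle_id: "proceed"}
    first, second = sorted([a, b], key=lambda v: v.vehicle_id)
    return {first.vehicle_id: "proceed", second.vehicle_id: "hold"}

print(resolve_standoff(Vehicle("cab_17", reversing=True),
                       Vehicle("cab_04", reversing=False)))
# {'cab_17': 'hold', 'cab_04': 'proceed'}
```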

So I don't know anything about how these autonomous vehicles... I guess they don't really communicate with each other. Right? That must not be a thing. So it's just the sensors on how close something is in the direction I'm heading and how fast I'm heading there in relation to the other things around me. And that really isn't communication.

That's just observing... Mhmm. ...what's happening around me. Right? As far as I know, that's right. And that keeps... that's why there are so many issues with the driverless vehicles, especially at high speeds.

There was a recent news video from Vox around Teslas and how their computers are like black boxes and really difficult to pull out. But they have this proprietary data where, if you have a driver, you can actually show everything that the computer was predicting before it had a crash. Mhmm. And it is unable to interpret what's going on around it like a human would, because the computer vision is simply calculating and identifying things around it with some basic logic, but it's not really able to predict the impact of the previous decision that it might have made. So the previous decision is: I'm gonna go 60 miles an hour, and I'm gonna go into a new lane to avoid this other vehicle.

K? But it doesn't know that it did that, and it can't understand why that other vehicle may have gone to that other lane. And in the case of the accident, the vehicle went to the other lane because there was construction going on in front of it, and it was dark and not well lit. And the computer vision didn't pick it up, and it drove 60 miles an hour into the construction zone. Mhmm.

So we're getting off on a little tangent here about all these different areas and applications of AI. But the more I look into it and read about it, the less apprehensive I become about it replacing my job. Because... Yeah. It's a process. I'm not really worried about that, because I have a much higher level of thinking.

While I don't have all that, quote, unquote, data there that I can quickly recall, I'm smarter than what ChatGPT puts out now in a lot of ways. And that's the key: humans are still able to reason better than computers. Yeah. I mean, I've just been using it to provide me, you know: hey, I need to write a function that, you know, does a query against the graph database, matching on this criteria and returning these results.

It does a relatively good job with that. I can look that up pretty quickly, and I could look at reference guides and documentation and piece that together pretty quickly. However, if I get a little more complicated with that logic, it goes off the rails. You know, the hallucinations kick in, as you say. You're like, well, that's not close at all to what I really wanted to get out of you, AI engine.
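For context, the kind of helper Greg describes asking the model for might look like the sketch below, assuming a Neo4j-style graph store and its Python driver; the node labels, relationship name, and properties are invented for illustration and are not QFlow's actual schema.

```python
from neo4j import GraphDatabase  # assumes a Neo4j-style graph store

def find_documents_by_label(uri: str, auth: tuple[str, str],
                            label: str, min_year: int) -> list[dict]:
    """Match documents tagged with a given label, filter on a property,
    and return a few fields, ordered newest first."""
    query = (
        "MATCH (d:Document)-[:TAGGED_AS]->(l:Label {name: $label}) "
        "WHERE d.year >= $min_year "
        "RETURN d.title AS title, d.year AS year "
        "ORDER BY d.year DESC"
    )
    driver = GraphDatabase.driver(uri, auth=auth)
    try:
        with driver.session() as session:
            result = session.run(query, label=label, min_year=min_year)
            return [record.data() for record in result]
    finally:
        driver.close()
```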

And then... But you know that. Because I do know that. Yeah. And that's what it can't seem to replace. And I'll say "yet"; maybe there is a future where it will become better. You know, we've seen on certain workloads how rapidly the models improve over time.

I still think, though, it's the quality of the use. And, kinda like we all like to talk about Steve Jobs: Steve Jobs and his recollection of reading Science Magazine, and the human and a bicycle. Do you remember the story? I don't remember the story. So he was looking at the locomotive efficiency of different mammals, or different animals in the animal kingdom.

Mhmm. And at the very, very top in efficiency was the condor. So it was using the least energy for the most distance and output. Right? And humans were, like, 14th on the list when we're running.

However, if you put that same human on a bicycle... Right. ...we crush the condor. And that sparked Steve Jobs' imagination, because he started to think about... that's why humans are different: we can use tools to become more efficient. And if we think about a computer, in his words... Mhmm. ...he saw a computer as a bicycle for the mind.

Yeah. Which I thought was pretty cool. And when I look at AI and I look at these tools, I always think about how it is advantaging the user. Mhmm. And I think so many people are talking, and even the news is saying this, of feeling replaced by these tools.

But at the end of the day, the human is the judgment. The human is the arbiter. The human is the shepherd of these tools and how they're going to be applied and the quality they can produce. If that isn't what's focused on and honed, you're gonna get a generation of people like that damn Idiocracy movie, where they are just helpless about what is happening to them from these tools. You know what I'm honing in on and focused in on right now?

What are the other animals that have the high efficiency? You've got my brain turning there. Well, I really believe that, you know, the condor... I like the condor. That's a good one. But then it made me immediately think about the albatross.

Right? You know about the albatross? You know what I mean? About the golf term. Yeah.

Well, let me know when you get one of those, and we'll talk more. Okay. But the albatross, right, it's a bird, a large bird like the condor. And I don't know all the details exactly, but I know that it flies, I think, for, like, perhaps weeks or months on end over the open ocean. Like, I don't think it... like, it can go a very long time.

We should look it up right now. We should ask ChatGPT. How long does an albatross stay out over open water? I don't know. It's a long time.

Albatrosses. I'm just gonna look it up real quick. "Among the largest flying birds." How big? Where's the... years.

5 years? Before learning to land. There you go. 5 years. So it has to be pretty dang efficient.

I don't understand how the condor is more efficient at using... what was... what'd you say? And I think you're missing the story. Like... Well, it's just got me thinking about the most efficient animals that are out there. I got your story, where... Okay. ...the bicycle makes a human a lot more efficient, and it's a tool.

Right? And the computer is a tool for the mind. I heard you. I was listening. Alright.

I'm able to repeat what I heard. You know? It's good stuff. A great place to end. Tune in next week as we discover more about bird law.

And if a seagull, or any gulls really, would pierce your eardrums if you owned them as a pet. I'm gonna go look up the animals 2 through 13 on the efficiency list and see if I agree with it or not. Alright. Did I distract long enough, Emily, for you to finish all your stuff? Alright.

That's it for today's episode of The Interconnectedness of Things. A big thank you to Dr. Andrew Hutson and Greg Romano for sharing their expertise on knowledge graphs and the power of AI in data management. If you found this discussion valuable, don't forget to subscribe, leave a review, and share it with your network. Stay tuned for more conversations on how technology is shaping our world. Until next time, stay curious and stay connected.