Glean: An Innovator in AI + Enterprise Search

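According to a McKinsey survey, the typical knowledge worker spends more than a quarter of their time searching for information. The problem is obvious: companies want employees to spend their time on the work that makes the business most successful, not on hunting for things.

Solutions have been scarce because this is a hard problem. First, every enterprise is different, with its own information, applications, technology stack, and employee backgrounds. Second, most applications only began to support APIs within the last decade. Before that, trying to integrate with messaging applications was very difficult, if not impossible.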

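Now, thanks to a rapidly changing technology landscape and advances in AI, solutions like Glean have become possible. Glean is a productivity startup that has built an intelligent enterprise search assistant by using more than 100 APIs to index and understand document context across dozens of products. As information grows more complex, Glean gives knowledge workers a Google-like experience for searching content and employee expertise more effectively. The company is changing how modern teams find internal information: by providing a centralized repository of company information, it lets employees quickly find whatever documents, information, or people they need for their work. Its latest valuation reached $2.2 billion, and it is backed by firms including Kleiner Perkins, Lightspeed, and Sequoia. Its customers include Databricks, Duolingo, and Grammarly.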

[Images. Source: Glean]

Arvind Jain (CEO) launched Glean in 2019 after spending over a decade at Google as an engineer, where he led teams in Google’s Search, Maps, and YouTube products. Earlier in his career, he held leadership positions at Akamai and Microsoft. Jain previously co-founded Rubrik, a data security business that reached unicorn status in just over a year. While leading Rubrik, Jain and his team utilized a tech stack made up of 300+ cloud applications. With data scattered across so many pieces of software, Jain found his own productivity stunted by the time spent locating the right information. In annual employee surveys at Rubrik, Jain observed that finding information was the biggest productivity challenge for people. That served as the inspiration for Glean.

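Like many successful technology companies today, Glean was born from a real challenge experienced at work.

Jain graduated from the Indian Institute of Technology, Delhi and the University of Washington. Passionate about entrepreneurship from a young age, he came to Glean with experience at large technology companies, at startups, and as a co-founder, along with strong technical skills. He worked as a Software Engineer at Microsoft and then, during the internet boom and its wave of startups, joined the startup Akamai Technologies, where he spent three years as an Architect.

After Akamai, Jain joined Riverbed Technology as a founding engineer, gaining experience in building a company from scratch.

Jain joined Google in 2003 and spent 11 years there as a Distinguished Engineer, leading teams on Google Search, Maps, and YouTube. In 2014, he left Google to co-found Rubrik, one of the fastest-growing companies in cloud data management.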

Jain recruited a group of Google and Facebook veterans including T.R. Vishwanath, who now leads Glean’s infrastructure teams, Tony Gentilcore, who leads Glean’s product engineering team, and Piyush Prahladka, who led Glean’s search efforts until he left the company in July 2023. Prior to Glean, Vishwanath spent nearly a decade as a Principal Software Engineer at Facebook, working on areas like News Feed ranking, Ads, and Developer Platform. Prahladka spent several years at Uber and over a decade at Google, including a long stint on Google’s core search ranking team. Gentilcore spent over a decade at Google, helping modernize the web search interface and leading Chrome’s Speed Team.

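Many members of Glean’s founding team had worked at Google, where they had the benefit of using Moma, a custom intranet that indexed all the content used inside Google.

The problem Glean set out to solve was therefore clear: build a Google-like search system inside a company that can integrate information from hundreds of different SaaS applications.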

Jain and team launched Glean in 2019 out of an incubation space in Kleiner Perkins’ Menlo Park office, choosing to build in stealth for several years before officially launching in 2021.

[Image. Source: Glean]

Glean is a unified search product that indexes dozens of applications, understanding context, language, behavior, and employee relationships, to find personalized answers to questions. The product is built on company knowledge and content, with permissioning and data governance in mind. Glean is a layer on top of a company’s software applications that users can engage with through a web app, new tab page, sidebar search, native search, or Slack command. To provide its core features, Glean retrains language models on a company’s unique knowledge base to develop a thorough understanding of content, language, people, and relationships.
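
To make "permissioning in mind" concrete, here is a minimal sketch of permission-aware search. It is not Glean's implementation: the `Document` type, the group names, and the substring match are all hypothetical, chosen only to show that a result is returned when it both matches the query and is readable by the person searching.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    title: str
    source: str                 # illustrative source label, e.g. "gdrive"
    allowed_groups: frozenset   # groups permitted to read this document

def search(query, index, user_groups):
    """Return titles that match the query AND that the user may read."""
    q = query.lower()
    return [
        doc.title
        for doc in index
        # A document is visible only if the user shares at least one group with it.
        if q in doc.title.lower() and doc.allowed_groups & user_groups
    ]

# Hypothetical index spanning several source applications.
INDEX = [
    Document("Roadmap 2024", "gdrive", frozenset({"product", "exec"})),
    Document("Roadmap budget", "sheets", frozenset({"finance"})),
    Document("Onboarding guide", "confluence", frozenset({"everyone"})),
]

print(search("roadmap", INDEX, frozenset({"product"})))
```

The same query returns different results for different users, which is the property a unified index needs if it is to respect the access rules of the applications it draws from.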

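Thanks to the new Transformer technology and large language models that Google had released into the open, the Glean team could do something revolutionary: generate embeddings and build semantic search. In 2019 this was almost unheard of. Inside the product, the result is a fundamentally better user experience and more accurate search results. If an employee types “show me the product manual for X” into Glean, it surfaces X’s user guide, X’s team handbook, X’s product manual, and anything else that matches semantically, not just by keyword.

This revolutionary approach to knowledge retrieval gave Glean an edge over many competitors, which at the time typically used QR-based search and traditional information retrieval (IR) techniques.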
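
The difference between keyword matching and embedding-based matching can be shown with a toy example. This is a sketch under loud assumptions: the three-dimensional vectors in `TOY_VECTORS` are hand-written for illustration, whereas a real system would obtain embeddings from a trained language model, and averaging word vectors is only a stand-in for a proper sentence encoder.

```python
import math

# Hypothetical hand-written "embeddings"; related words get nearby vectors.
TOY_VECTORS = {
    "manual":   (0.9, 0.1, 0.0),
    "guide":    (0.85, 0.15, 0.0),
    "handbook": (0.8, 0.2, 0.0),
    "product":  (0.3, 0.7, 0.0),
    "user":     (0.3, 0.6, 0.1),
    "team":     (0.2, 0.6, 0.2),
    "expense":  (0.0, 0.1, 0.9),
    "policy":   (0.1, 0.0, 0.9),
}

def embed(text):
    """Average the toy vectors of the known words in the text."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0, 0.0)
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_search(query, docs):
    """Return documents containing every query word verbatim."""
    terms = set(query.lower().split())
    return [d for d in docs if terms <= set(d.lower().split())]

def semantic_search(query, docs, threshold=0.9):
    """Return documents whose vector is close to the query vector, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True) if score >= threshold]

DOCS = ["X product manual", "X user guide", "X team handbook", "Y expense policy"]

print(keyword_search("product manual", DOCS))
print(semantic_search("product manual", DOCS))
```

Keyword search finds only the document that literally contains both words, while the vector comparison also surfaces the user guide and team handbook and leaves out the unrelated expense policy.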

[Image. Source: Lightspeed]

Glean’s technology surfaces personalized results based on the user’s role and relationships. In April 2023, Glean launched several generative AI features to streamline search results, such as AI answers, which generates a single concise answer to a query.

[Image. Source: Glean]

In June 2023, Glean launched Glean Assistant, a ChatGPT-powered chatbot that draws on the same corporate knowledge graph Glean uses for search to give concise responses to user questions. Glean Assistant can answer queries directly, or provide a response as part of a search on Glean’s broader platform.

[Image. Source: Glean]

Glean can curate collections of information, aggregate verified answers, and enable managers to create and share “Go Links” that help employees navigate to common collections of resources. For example, a company can assemble a “product-roadmap” Go Link so that users can easily access all the necessary internal resources to create a new product roadmap.
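
Conceptually, a Go Link is a short, memorable name that resolves to a curated set of resources. The sketch below illustrates that idea only; the `go/` prefix, the registry layout, and the example.com URLs are assumptions, not Glean's actual format.

```python
# Hypothetical registry: short name -> curated list of resource URLs.
GO_LINKS = {
    "product-roadmap": [
        "https://example.com/docs/roadmap-template",
        "https://example.com/wiki/planning-process",
    ],
}

def resolve(go_link, registry=GO_LINKS):
    """Map 'go/<name>' (or a bare name) to its curated resource list."""
    name = go_link.strip().lower()
    if name.startswith("go/"):
        name = name[len("go/"):]
    # Unknown names resolve to an empty list rather than raising.
    return registry.get(name, [])

print(resolve("go/product-roadmap"))
```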

[Image. Source: Glean]

Glean has built out a “home page” that serves as the go-to spot for employees to find AI-generated recommendations for documents to read through or people with relevant knowledge. This home page includes company announcements, employee directories, a calendar, and multiple widgets to help supercharge employee collaboration like related people, recent work, and availability to meet.

[Image. Source: Glean]

A critical piece of Glean’s ability to integrate across an organization's various sources of knowledge is its ability to connect with those disparate systems. Glean has built over 100 connectors to different systems that customers rely on. Glean’s ability to expand into larger organizations is significantly limited if the product is unable to ingest knowledge information from a particular core system.
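
One common way to structure this kind of integration is a shared connector interface with one implementation per source system. The sketch below assumes that pattern; the `Connector` class, its `fetch` method, and the in-memory stand-in are hypothetical and do not describe Glean's actual connector API.

```python
from abc import ABC, abstractmethod

class Connector(ABC):
    """Hypothetical interface: one subclass per source system."""
    name: str

    @abstractmethod
    def fetch(self):
        """Yield (doc_id, text) pairs from the source system."""

class InMemoryConnector(Connector):
    """Stand-in for an API-backed connector; serves canned records."""
    def __init__(self, name, records):
        self.name = name
        self._records = records

    def fetch(self):
        yield from self._records.items()

def build_index(connectors):
    """Merge every connector's documents into one index keyed by 'source:id'."""
    index = {}
    for connector in connectors:
        for doc_id, text in connector.fetch():
            index[f"{connector.name}:{doc_id}"] = text
    return index

def search(index, term):
    """Return the keys of all documents containing the term, in sorted order."""
    term = term.lower()
    return sorted(key for key, text in index.items() if term in text.lower())

index = build_index([
    InMemoryConnector("wiki", {"1": "Launch plan for Q3"}),
    InMemoryConnector("tickets", {"T-9": "Bug blocking launch", "T-10": "Update logo"}),
])
print(search(index, "launch"))
```

Under this design, supporting a new source system means writing one more subclass, while the index and search code stay unchanged. A system with no connector contributes nothing to the index, which is why a missing core system limits the product's reach.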

[Image. Source: Glean]

Glean began by targeting technology companies with between 500 and 2K employees, which were still able to move fast and experiment. According to an interview with Outreach, a Glean customer, Glean adds the most value to organizations that have passed an initial setup phase and are entering a growth phase with roughly 100 employees or more. As one example, Confluent is a Glean customer that quickly grew from 250 to 2K+ employees. Confluent believes that Glean has enabled its employees to save 15K+ hours per month.

The global enterprise search market size was valued at $4.2 billion in 2022 and is expected to expand at a CAGR of 8.9% from 2023 to 2030. This growth is largely due to the increasing prioritization of breaking down information silos within organizations, especially as the number of applications used within each company grows. Advancements in large language models could also further increase the capabilities of enterprise search, which could significantly increase the addressable market size. As more data moves into the cloud, hosted search is expected to be the fastest-growing segment in this market.
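
As a quick check on what these figures imply, the snippet below compounds the 2022 valuation at the stated CAGR. It assumes eight full compounding periods from the 2022 base through 2030; the source does not state the base-year convention, so treat the result as a rough illustration.

```python
def project(base_value, cagr, years):
    """Compound a starting value at a constant annual growth rate."""
    return base_value * (1 + cagr) ** years

# $4.2B in 2022 growing 8.9% per year; assumes 8 compounding periods to 2030.
market_2030 = project(4.2, 0.089, 2030 - 2022)
print(f"Implied 2030 market size: ${market_2030:.1f}B")  # prints $8.3B
```

Under that assumption the market roughly doubles over the period.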

While an estimated 61% of companies use some kind of enterprise search functionality, the biggest obstacle is the limited scope of each search tool. Most enterprises have multiple knowledge databases (e.g. Jira, Google Docs, etc.) but search functions are limited to each individual platform.

The challenge of enterprise search has been a well-established problem that a number of companies have tried to solve. In 2007, a company called Powerset raised $12.5 million from investors like Peter Thiel and Luke Nosek, founders of PayPal, and Reid Hoffman, founder of LinkedIn. In 2008, the company was acquired by Microsoft for $100 million. Originally, technology like Powerset was expected to become an enterprise search juggernaut built around Microsoft’s SharePoint collaboration platform. However, Barney Pell, the founder of Powerset, focused primarily on Bing, and much of the excitement around search in the early 2010s was more focused on consumer use cases, rather than enterprise.

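Among generative AI applications, search is a closely watched market that is attractive on both the consumer and the enterprise side. For more than a decade, the search user experience and the industry landscape saw almost no major change or disruption. Startups are therefore trying to optimize and change how search works, and to take market share.

Because so many companies have tried to solve enterprise search, the field has long been crowded. Major players include large technology companies such as Microsoft, Google, Amazon, IBM, and Oracle, as well as enterprise search specialists such as Coveo, Lucidworks, Glean, and Mindbreeze. Some, like Glean, are newly founded; others, like Coveo, have been around for well over a decade.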

[Image. Source: Gartner]

Microsoft: Microsoft has continued to demonstrate industry leadership in enterprise search with a number of different solutions. SharePoint Syntex was launched in 2020 and made available to all Microsoft 365 users in 2022. The product offers content AI integrated into user workflows in order to automatically add tags, and index high volumes of content, so users can search effectively. Unlike Glean, SharePoint doesn’t connect with every information source within a company, limiting its access to Office 365. Within Microsoft’s Azure cloud service, the company provides Azure Cognitive Search, an information retrieval system within a customer’s web applications and data, both for internal enterprise use cases and external website or ecommerce search.

Google Cloud Search: Google announced Google Cloud Search in 2017 as a search platform spread “across G Suite products, including Drive, Gmail, Sites, Calendar, Docs, Contacts and more.” Originally, the product focused on seamless integration across Google Workspace apps, but Google has increasingly launched connections to external platforms similar to those of other enterprise search offerings. The product includes connections to GitHub, Confluence, Jira, and Slack, among others.

Amazon: In 2020, Amazon announced the release of Amazon Kendra, an enterprise search platform that enables users to ask contextual questions and search across silos for relevant information, both within Amazon’s ecosystem (e.g., S3) and in external systems (Salesforce, Slack, etc.). Amazon also offers Amazon CloudSearch, a cloud-based search service that is primarily focused on external use cases like a website or ecommerce store, and Amazon OpenSearch Service, derived from Elasticsearch, which is primarily for application performance review, rather than knowledge management.

Elastic: Founded in 2012 by the creators of the popular open-source project Elasticsearch, Elastic provides software products for developers, startups, and enterprises to make massive amounts of complex structured and unstructured data usable. By focusing on scalability, ease of use, and ease of integration, Elastic’s products are used for real-time search, logging, analytics, and security to power internal and external applications for organizations like Cisco, eBay, Goldman Sachs, and Groupon. Elastic went public in October 2018 after raising a total of $162 million from investors like Benchmark, Index, and NEA. Elastic mostly works on the back end of enterprises, powering external application interfaces without users ever realizing how their searches are being executed, and has not yet shown a desire to compete with Glean’s vision of becoming the Google of internal company data search.

Coveo: Founded in 2005 in Quebec, Canada, Coveo is an AI search solution for ecommerce, websites, customer service, and workplaces. Unlike Glean, which solely focuses on internal enterprise search, Coveo’s product suite includes ecommerce and website offerings as well. Coveo raised a $172 million funding round in November 2019 at a valuation of $1.1 billion led by Omers Capital Private Growth Equity Group. In total, the company has now raised just over $402 million with this round from investors like Evergreen Coast Capital, FSTQ, and IQ Ventures.

Vectara: Founded in 2022 by Amr Awadallah, who previously worked at Google and co-founded Cloudera, Vectara is a generative conversational search platform that seeks to provide a ChatGPT-like experience for business users looking to engage with their internal data. The company raised a $28.5 million seed round in June 2023.

Guru: Founded in 2013, Guru is an enterprise knowledge management platform meant to act as an employee intranet with guidance to the most important resources within an organization. The company provides intelligent search, AI-powered answers, and automated in-context knowledge. The company raised a $30 million Series C in April 2020 from Accel, bringing its total funding amount to ~$71 million.

Neeva: Traditionally, enterprise search companies are focused on enabling users to search for information across their employer’s internal knowledge databases. Some products, like Azure Cognitive Search or Amazon CloudSearch, are focused on enabling search functionality within an existing cloud ecosystem. Neeva is a company that had previously been focused on consumer search before announcing a pivot to AI-powered enterprise search on May 21st, 2023. Just three days after the announced pivot, Snowflake announced that it had acquired Neeva for an undisclosed amount. Neeva is expected to provide a higher-quality search capability within Snowflake’s cloud ecosystem. As Benoit Dageville, the co-founder of Snowflake, explained: “The ability for teams to discover precisely the right data point, data asset, or data insight is critical to maximizing the value of data.”

Glean operates on a subscription model with a traditional enterprise sales motion: individual users cannot sign up for Glean and start using it on their own. The team has not yet standardized pricing and relies on custom pricing after a free 30-day pilot. One interview indicated that early customers of Glean like Outreach paid a flat rate of ~$50K per year, regardless of the number of employees using the software.

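Glean’s business model is purely B2B and is not open to individual users. Glean typically offers enterprises two pricing models: per-seat pricing, at under $100 per seat per month, and custom pricing for enterprise-level solutions.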

In 2022, Glean users ran 7 million searches with a satisfaction rate of 85%. Although Glean has not publicly disclosed revenue numbers, in May 2023 it revealed that it was serving over 100 customers, including notable companies like Uber, ExtraHop, Grammarly, Outreach, and Confluent. As of July 2023, ZoomInfo estimated Glean’s revenue at $14.7 million.

In March 2019, Glean raised $15 million in a Series A from investors including Kleiner Perkins.

In March 2021, it raised $40 million in a Series B from investors including General Catalyst.

In May 2022, Glean raised a $100 million Series C led by Sequoia at a $1 billion valuation, with participation from existing investors including Quentin Clark of General Catalyst, Mamoon Hamid of Kleiner Perkins, Ravi Mhatre of Lightspeed, and Slack Fund. This funding round brought the company’s total funding to $155 million.

In February 2024, Glean completed a $200 million Series D at a $2.2 billion valuation. The investors were Kleiner Perkins, Lightspeed Venture Partners, Sequoia Capital, Coatue, ICONIQ Growth, IVP, and Capital One Ventures. The company has raised about $360 million in total.

With rapid advancement in LLMs like GPT-4, an enterprise AI assistant is becoming more practical. The vision for this kind of product is an AI assistant that takes goals or tasks and leverages AI to find answers, or even to make progress toward completing tasks. In one estimate, the market for business process automation technologies, which streamline enterprise customer-facing and back-office workloads, will grow from $9.8 billion in 2020 to $19.6 billion by 2026. This could be a natural addition to Glean’s current core enterprise search product.
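
The estimate gives only start and end values, so the implied annual growth rate has to be backed out. The sketch below assumes six compounding periods between 2020 and 2026 and smooth growth in between, neither of which the estimate specifies.

```python
def implied_cagr(start, end, years):
    """Annual growth rate that turns `start` into `end` over `years` periods."""
    return (end / start) ** (1 / years) - 1

# $9.8B (2020) to $19.6B (2026): a doubling over an assumed six periods.
rate = implied_cagr(9.8, 19.6, 2026 - 2020)
print(f"Implied CAGR: {rate:.1%}")
```

A doubling over six years works out to a little over 12% per year.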

Glean is uniquely positioned because it has access to the entire corpus of a company’s internal data. Having built out the employee knowledge graph, Glean now knows how every employee is connected to each other. Similar to how Rippling has built out a suite of tools from its employee knowledge graph, Glean can leverage its platform position and source of truth within an enterprise to potentially build out knowledge products in specific use-cases like HR, payroll, app management, device management, and other tools, expanding its potential market size.

Enterprise search is a difficult market. Consumer search has been able to succeed largely because of the massive amount of data on the public internet that these products have to work with. Other companies, like Elastic and Algolia, have found some success enabling search capabilities for developers. But internal enterprise search has continued to be an elusive market.

Growing privacy concerns and frequently changing data and compliance policies pose obstacles for Glean. Asking enterprises, especially those in the financial and healthcare sectors, to willingly provide access to all of their internal documents and data is a large ask.

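One way Glean currently addresses this is by designing its product around a single-tenant architecture, in which each customer runs its own Glean instance in its own environment. Single tenancy not only gives customers more control over and customization of their instance but, crucially, also provides them with the highest level of security.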

Glean has a potential cold start problem. Glean cannot draw inferences about users or items if it hasn’t gathered enough information. In one interview, a Glean customer indicated that even after Glean launched, adoption among employees could be as low as 20-40%. Adoption can also be relatively uneven across teams. Engineering, product, and IT use Glean regularly, given the number of documents they create and the number of channels they communicate through. However, for more operational teams like G&A and marketing, adoption may not be as consistent or strong.

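Glean’s foundation is the search product the company has developed over the past 4.5 years. It powers not only Glean Chat but also any generative AI application an enterprise may want to build. As a result, Glean has inadvertently become a standard enterprise generative AI platform, one that seamlessly consolidates a large body of company-specific knowledge and makes it easy to use across a variety of AI applications.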

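Technically, Glean’s approach is distinctive. The company takes smaller open-domain models, such as the BERT family, and customizes them for each customer. This customization involves training the model on that enterprise’s own body of data, so that the system understands the company’s particular terminology, concepts, code names, and acronyms. The fine-tuned models can then be used for functions such as semantic similarity and synonym detection.

For user-facing interactions, Glean uses very large language models (LLMs) such as GPT-4, GPT-3.5, PaLM, or Llama 2. These LLMs are integrated through APIs and are mainly responsible for generating the AI-driven answers shown to the end user. Because their main role is to summarize and synthesize, no further training of these LLMs is needed.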
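
The division of labor described here, a customized model for retrieval followed by an untrained general LLM for synthesis, resembles a retrieve-then-generate pipeline. The sketch below is a minimal illustration under stated assumptions: `retrieve` uses word overlap as a stand-in for a fine-tuned embedding model, `echo_llm` is a placeholder for a hosted LLM API call, and the prompt wording is invented for the example.

```python
def retrieve(query, corpus, k=2):
    """Stand-in for the customized retrieval model: rank by shared words."""
    q = set(query.lower().split())
    scored = [(len(q & set(text.lower().split())), doc_id)
              for doc_id, text in corpus.items()]
    # Keep only documents with some overlap; best score first, ties by id.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:k]]

def build_prompt(query, corpus, doc_ids):
    """Pack the retrieved passages and the question into one prompt."""
    context = "\n".join(f"[{d}] {corpus[d]}" for d in doc_ids)
    return f"Answer using only the context.\n{context}\nQuestion: {query}"

def answer(query, corpus, llm):
    """`llm` is any callable that takes a prompt string; it is never trained here."""
    doc_ids = retrieve(query, corpus)
    return llm(build_prompt(query, corpus, doc_ids)), doc_ids

def echo_llm(prompt):
    """Placeholder for a hosted LLM API call; returns the prompt's last line."""
    return prompt.splitlines()[-1]

# Hypothetical company documents.
CORPUS = {
    "hr-1": "vacation policy allows 20 days per year",
    "eng-7": "deploy policy requires two approvals",
    "it-3": "laptop refresh every three years",
}

text, sources = answer("what is the vacation policy", CORPUS, echo_llm)
print(sources)
```

Because the LLM only sees passages chosen by the retrieval step, company-specific knowledge lives in the retrieval layer, and the generating model can be swapped without retraining.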

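Glean is trying to reshape how knowledge workers find and consume information. By integrating with as many of an enterprise’s knowledge databases as possible, Glean can build a contextual knowledge graph that captures which information is critical, where it is stored, and who in the company holds any additional context about it. As Glean continues to expand into larger enterprises, it will need to demonstrate that it can handle increasingly complex knowledge graphs in order to keep serving customers effectively.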