MongoDB Blog

Announcements, updates, news, and more

Announcing Search Index Management in MongoDB Compass

You can now create and manage Atlas Search and Atlas Vector Search indexes in the interface many of you know and love: MongoDB Compass. Seamlessly build full-text and semantic search applications on top of your Atlas database, delivering swift and relevant results for a range of use cases, including e-commerce sites, customer support chatbots, recommendation systems, and more. Gone are the days of juggling multiple tools to bring your search queries to fruition. And with a variety of templates to choose from, Compass simplifies learning search index syntax so you can focus on what’s most important to you: building exceptional end-user experiences on top of your search queries.

Try it out

To get started, connect to an Atlas cluster from Compass. If you don’t have one, sign up. From there, simply navigate to Compass’ Indexes tab and select Create Search Index. It’s easy to build your first search index using one of our templates: select either Search or Vector Search, and use the appropriate template. In this example, we’re going to create a Vector Search index. Once you’re satisfied with your index definition, click Aggregate to start testing out your pipeline in Compass. Compass’ new search index experience leads you to results in just three guided steps, all without leaving the comfort of Compass.

To learn more about search indexing in Compass, visit our documentation. If you have feedback about Compass’ search index experience, let us know on our feedback forum. Happy indexing!
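For reference, a Vector Search index definition like the one the Compass template scaffolds is a small JSON document. As a minimal sketch, with an illustrative field name and dimension count that you would set to match your own embedding field and model:

```python
# Sketch of an Atlas Vector Search index definition of the kind the
# Compass template produces. "plot_embedding" and 1536 are placeholders;
# numDimensions must match the embedding model you use.
index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "plot_embedding",   # document field holding the vectors
            "numDimensions": 1536,      # e.g. OpenAI text-embedding models
            "similarity": "cosine",     # or "euclidean" / "dotProduct"
        }
    ]
}
```

Pasting a definition in this shape into the Compass editor (or adjusting the template in place) is all that is needed before clicking Create Search Index.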

March 18, 2024
Updates

From Relational Databases to AI: An Insurance Data Modernization Journey

Imagine you’re a data architect, a developer, or a data engineer at an insurance company. Management has asked you and your team to build a new AI claim adjustment system, a customer-facing LLM-powered chatbot, and an application to streamline the underwriting process. However, doing so is far from straightforward due to the challenges you face on a daily basis. The bulk of your time is spent navigating your company’s outdated legacy systems, which were built in the 1970s and 1980s. Some of these legacy platforms were written in COBOL and CICS, and today very few people on your team know how to develop and maintain those technologies. Moreover, the data models you work with are another source of frustration. Every interaction with them is a reminder of the intricate structures that have evolved over time, making data manipulation and analysis a nightmare. In sum, legacy systems are preventing your team, and your company, from innovating and keeping up with both your industry and customer demands.

Whether you’re trying to modernize your legacy systems to improve operational efficiency or boost developer productivity, or you want to build AI-powered apps that integrate with large language models (LLMs), MongoDB has a solution for that. In this post, we’ll walk you through a journey that starts with a relational data model refactored into MongoDB collections, moves on to vectorization and querying of unstructured data, and ends with retrieval augmented generation (RAG): asking LLMs questions about data in natural language.

Identifying, modernizing, and storing the data

Our journey starts with an assessment of the data sources we want to work with. As shown below, we can bucket the data into three different categories:

- Structured legacy data: Tables of claims, coverages, billings, and more. Is your data locked in rigid relational schemas? This tutorial is a step-by-step guide on how to migrate a real-life insurance relational model with the help of MongoDB Relational Migrator, refactoring 21 tables into only five MongoDB collections.
- Structured data (JSON): You might have files of policies, insurance products, or forms in JSON format. Check out our docs to learn how to insert those into a MongoDB collection.
- Unstructured data (PDFs, audio, images, etc.): If you need to create and store a numerical representation (vector embedding) of, for instance, claim-related photos of accidents or PDFs of policy guidelines, have a look at this blog, which walks you through the process of generating embeddings of pictures of car crashes and persisting them alongside existing fields in a MongoDB collection.

Figure 1: Storing different types of data into MongoDB

Regardless of the original format or source, our data has finally landed in MongoDB Atlas, in what we call a Converged AI Data Store: a platform that centrally integrates and organizes enterprise data, including vectors, enabling the development of ML- and AI-powered applications.

Accessing, experimenting, and interacting with the data

It’s time to put the data to work. The Converged AI Data Store unlocks a plethora of use cases and efficiency gains, both for the business and for developers. The next step of the journey covers the different ways we can interact with our data:

- Database and full-text search: Learn how to run database queries, starting from the basics and moving up to advanced features such as facets, fuzzy search, autocomplete, highlighting, and more with Atlas Search.
- Vector Search: We can finally leverage unstructured data. The image search blog we mentioned earlier also explains how to create a Vector Search index and run vector queries against embeddings of photos.
- RAG: Combining Vector Search and the power of LLMs, it is possible to interact with our data in natural language (see Figure 2 below), asking complex questions and getting detailed answers. Follow this tutorial to become a RAG expert.

Figure 2: Retrieval augmented generation (RAG) diagram, where we dynamically combine our custom data with the LLM to generate reliable and relevant outputs

Having explored all the different ways we can ask questions of the data, we have made it to the end of our journey. You are now ready to modernize your company’s systems and finally keep up with the business’ demands. What will you build next?

If you would like to discover more about Converged AI and Application Data Stores with MongoDB, take a look at the following resources:

- AI, Vectors, and the Future of Claims Processing: Why Insurance Needs to Understand The Power of Vector Databases
- Build a ML-Powered Underwriting Engine in 20 Minutes with MongoDB and Databricks
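The retrieval step of the RAG flow described above boils down to an ordinary aggregation pipeline. As a minimal sketch, with hypothetical index, collection, and field names (not from the tutorial itself):

```python
# Sketch of the retrieval step in a RAG pipeline: an Atlas Vector Search
# aggregation stage that finds claim documents semantically similar to a
# user's question. Index and field names are illustrative placeholders.

def build_retrieval_pipeline(question_embedding, num_candidates=100, limit=5):
    """Return an aggregation pipeline using the $vectorSearch stage."""
    return [
        {
            "$vectorSearch": {
                "index": "claims_vector_index",   # hypothetical index name
                "path": "embedding",              # field holding the vectors
                "queryVector": question_embedding,
                "numCandidates": num_candidates,  # ANN candidate pool size
                "limit": limit,                   # top-k documents to return
            }
        },
        # Keep only the fields the LLM prompt actually needs as context
        {"$project": {"claim_text": 1, "_id": 0}},
    ]

pipeline = build_retrieval_pipeline([0.12, -0.07, 0.33], limit=3)
```

With pymongo this would run as `db.claims.aggregate(pipeline)`, and the returned claim texts would be stitched into the LLM prompt as grounding context.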

March 14, 2024
Applied

How MongoDB Enables Digital Twins in the Industrial Metaverse

The integration of MongoDB into the metaverse marks a pivotal moment for the manufacturing industry, unlocking innovative use cases across design and prototyping, training and simulation, and maintenance and repair. MongoDB's powerful capabilities — combined with augmented reality (AR) or virtual reality (VR) technologies — are reshaping how manufacturers approach these critical aspects of their operations, while also enabling the realization of innovative product features.

But first: What is the metaverse, and why is it so important to manufacturers? We often use the term "digital twin" to refer to a virtual replication of the physical world, commonly used for simulations and documentation. The metaverse goes one step further: Not only is it a virtual representation of a physical device or a complete factory, but it also reacts and changes in real time to reflect a physical object’s condition. The advent of the industrial metaverse over the past decade has given manufacturers an opportunity to embrace a new era of innovation, one that can enhance collaboration, visualization, and training. The industrial metaverse is also a virtual environment that allows geographically dispersed teams to work together in real time. Overall, the metaverse transforms the way individuals and organizations interact to produce, purchase, sell, consume, educate, and work together. This paradigm shift is expected to accelerate innovation and affect everything from design to production across the manufacturing industry. Here are some of the ways the metaverse — powered by MongoDB — is having an impact on manufacturing.

Design and prototyping

Design and prototyping processes are at the core of manufacturing innovation. Within the metaverse, engineers and designers can collaborate seamlessly using VR, exploring virtual spaces to refine and iterate on product designs.
MongoDB's flexible document-oriented structure ensures that complex design data, including 3D models and simulations, is efficiently stored and retrieved. This enables real-time collaboration, accelerating the design phase while maintaining the precision required for manufacturing excellence.

Training and simulation

Taking a digital twin and connecting it to physical assets enables training beyond traditional methods and provides immersive simulations in the metaverse that enhance skill development for manufacturing professionals. VR training, powered by MongoDB's capacity to manage diverse data types — such as time series, key-values, and events — enables realistic simulations of manufacturing environments. This approach allows workers to gain hands-on experience in a safe virtual space, preparing them for real-world challenges without affecting production cycles. Gamification is also one of the most effective ways to learn new things. MongoDB's scalability ensures that training data, including performance metrics and user feedback, is handled efficiently, so training modules can keep growing alongside the ever-increasing amount of data.

Maintenance and repair

Maintenance and repair operations are streamlined through AR applications within the metaverse. The incorporation of AR and VR technologies into manufacturing processes amplifies the user experience, making interactions more intuitive and immersive. Technicians equipped with AR devices can access real-time information overlaid onto physical equipment, providing step-by-step guidance for maintenance and repairs. MongoDB's support for large volumes of diverse data types, including multimedia and spatial information, ensures a seamless integration of AR and VR content.
This not only enhances the visual representation of data from the digital twin and the physical asset but also provides a comprehensive platform for managing the vast datasets generated during AR and VR interactions within the metaverse. Additionally, MongoDB's geospatial capabilities come into play, allowing manufacturers to manage and analyze location-based data for efficient maintenance scheduling and resource allocation. The result is reduced downtime through more efficient maintenance and improved overall operational efficiency.

From the digital twin to the metaverse with MongoDB

The advantages of a metaverse for manufacturers are enormous, and according to Deloitte, many executives are confident the industrial metaverse “will transform research and development, design, and innovation, and enable new product strategies.” However, the realization is not easy for most companies. Challenges include managing system overload, handling vast amounts of data from physical assets, and creating accurate visualizations. The metaverse must also be easily adaptable to changes in the physical world, and new data from various sources must be incorporated continuously and seamlessly. Given these challenges, having a data platform that can contextualize all the data generated by various systems and then feed it to the metaverse is crucial. That is where MongoDB Atlas, the leading developer data platform, comes in: providing synchronization capabilities between physical and virtual worlds, enabling flexible data modeling, and providing access to the data via a unified query interface, as seen in Figure 1.

Figure 1: MongoDB connecting to a physical & virtual factory

Generative AI with Atlas Vector Search

With MongoDB Atlas, customers can combine three systems — database, search engine, and sync mechanisms — into one, delivering application search experiences for metaverse users 30% to 50% faster.
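The geospatial maintenance scheduling mentioned above relies on standard MongoDB geospatial queries. As a minimal sketch, with hypothetical collection and field names, a dispatch lookup could filter assets near a technician using a `$near` query against a 2dsphere-indexed location field:

```python
# Illustrative geospatial filter for maintenance scheduling: find factory
# assets within a radius of a technician's position. Assumes a 2dsphere
# index on "location"; collection and field names are hypothetical.

def nearby_assets_filter(lon, lat, max_meters=5000):
    """Build a find() filter returning assets nearest-first within range."""
    return {
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": max_meters,  # meters, for 2dsphere indexes
            }
        }
    }

query = nearby_assets_filter(8.5417, 47.3769)  # example coordinates
```

With pymongo this would run as `db.assets.find(query)`, feeding a scheduler that assigns the closest open maintenance tasks.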
Atlas powers use cases such as similarity search, recommendation engines, Q&A systems, dynamic personalization, and long-term memory for large language models (LLMs). Vector data is integrated with application data and seamlessly indexed for semantic queries, enabling customers to build more easily and quickly. MongoDB Atlas enables developers to store and access operational data and vector embeddings within a single unified platform. With Atlas Vector Search, users can generate information for maintenance, training, and all the other use cases from any accessible source of information. That information can come from text files such as Word documents, from PDFs, and even from pictures or sound streams, from which an LLM then generates an accurate semantic answer. It’s no longer necessary to keep dozens of engineers busy creating manuals that are already outdated the moment a production line goes through first commissioning.

Figure 2: Atlas Vector Search

Transforming the manufacturing industry with MongoDB

In the digital twin and metaverse-driven future of manufacturing, MongoDB emerges as a linchpin, enabling cost-effective virtual prototyping, enhancing simulation capabilities, and revolutionizing training processes. The marriage of MongoDB with AR and VR technologies creates a symbiotic relationship, fostering innovation and efficiency across design, training, and simulation. As the manufacturing industry continues its journey into the metaverse, the partnership between MongoDB and virtual technologies stands as a testament to the transformative power of digital integration in shaping the future of production.

Learn more about how MongoDB is helping organizations innovate with the industrial metaverse by reading how we Build a Virtual Factory with MongoDB Atlas in 5 Simple Steps, how IIoT data can be integrated into MongoDB in 4 steps, or how MongoDB drives innovation end-to-end across the whole manufacturing chain.

March 12, 2024
Applied

Building AI With MongoDB: How GoBots AI for E-commerce Increases Retailer Sales Conversion by 40%

Major retail brands have long been using various forms of AI, for example statistical analysis and machine learning models, to better serve their customers. But with its high barriers to entry, one key channel has been slower to embrace the technology. By connecting large and small brands with customers, e-commerce marketplaces such as Amazon, Mercado Libre, and Shopify are among the fastest-growing retail routes to market. Since 2016, GoBots has been working to extend the benefits of AI to any retailer on any marketplace. It uses AI, analytics, and MongoDB Atlas to make e-commerce easier, more convenient, and smarter for brands serving Latin America.

“We are building an AI-driven customer service platform that revolutionizes e-commerce experiences,” says Victor Hochgreb, Co-Founder and CEO of GoBots. “Our solution makes the benefits of AI available to any retailer, whether large or small. With our GoBots natural language understanding (NLU) model, retailers automate customer interactions such as answering questions and resolving issues through intelligent assistants. At the same time, they leverage data analytics to offer personalized customer experiences.” Hochgreb goes on to say, “GoBots increases engagement and conversion rates for over 600 clients across Latin America, including Adidas, Bosch, Canon, Chevrolet, Dell, Electrolux, Hering, HP, Nike, and Samsung.”

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Figure 1: GoBots NLU AI models analyze customer questions and issues, providing human-like answers in seconds

Exploring GoBots’ AI stack

GoBots’ custom NLU models are built using the Rasa framework. Hochgreb says, “We have a neural network trained on over 150 million question-answer examples and more than 50 bots — specialists in different segments — to understand more specific questions.” Models are fine-tuned with data from the retailer’s own product catalog and website corpus.
The model runtime is powered by a PyTorch microservice on Google Cloud. The larger GoBots platform is built with Kotlin and orchestrated by Kubernetes, giving the company cloud freedom as its business expands and evolves.

Figure 2: GoBots question processing architecture

The GoBots AI assistants kick into action as soon as a customer asks a question on the marketplace site, with the questions stored in MongoDB Atlas. GoBots’ natural language models are called programmatically via a REST API to perform tasks like named entity recognition (NER), user intent detection, and question-answer generation, with all inferences also stored in MongoDB. If the models are able to generate an answer with high confidence, the GoBots service responds directly to the customer in real time. In the case of a low-confidence response, the models flag the question to a customer service representative, who receives a pre-generated suggested response. They can then verify the response and reply to the customer.

Increasingly, the company’s engineers are also evaluating the capabilities of large language models (LLMs) to respond to customer questions. The company is testing both commercial models from OpenAI and open source models such as Llama-2 and Mixtral hosted on Hugging Face. With all question-answer pairs from the different models written to the MongoDB Atlas database, the data is used to further tune the natural language models while also guiding model evaluations. The company has also recently started using Atlas Vector Search to identify and retrieve semantically similar answers to past questions. The search results power a co-pilot-like experience for customer service representatives and provide in-context training to its fleet of LLMs.

“Having our source data and metadata stored and synced side by side with our vector embeddings dramatically accelerates how quickly my developers build with AI. It also improves the quality of the outputs we return to customers, driving higher conversions and customer satisfaction.”

Victor Hochgreb, Co-Founder and CEO of GoBots

Why MongoDB?

With the power of MongoDB’s developer data platform and the flexibility of MongoDB’s document model, GoBots builds higher-performing AI-powered applications faster:

- MongoDB Atlas provides a single data platform that serves multiple operational and AI use cases. This includes user data and product catalogs as well as a store for AI model inferences, outputs of multiple AI models for experimentation and evaluation purposes, a data source for fine-tuning models, and vector search. The company is evaluating the use of Atlas Triggers for invoking AI model API calls in an event-driven manner as the underlying data changes.
- The field of natural language processing is progressing rapidly, with new AI models released all the time. Finding the right AI model for a use case that balances the performance-price tradeoff requires experimentation on historical data. The flexibility provided by MongoDB’s document model allows the development team to continually enrich historical questions with outputs generated by different models and compare the results. This means they are not blocked behind complex schema changes that would otherwise slow down the pace of harnessing new data in their models for training and inference.
- The question-answer pairs output by the company’s NLU models and LLMs are complex data structures with many nested entities and arrays. Being able to persist these directly to the database, without first having to transform them into a tabular structure, improves developer productivity and reduces application latency.

It was this flexibility that was behind the decision to use MongoDB from the outset. “We were building fast, continually testing new features to scale what worked and kill what didn’t,” says Hochgreb.
“Only MongoDB provided the developer ease of use and flexibility to meet my time-to-market demands.” The company initially ran MongoDB itself before upgrading to MongoDB Atlas in 2019. “The company was growing fast and I wanted to focus my engineering team on building, not operating. That is exactly what Atlas and its managed service enabled us to do,” says Hochgreb. “With Atlas we were able to maintain high uptime in the face of constant service scaling, with deep monitoring and observability into our platform. In the first year of running in MongoDB Atlas we were able to avoid hiring a full-time infrastructure engineer, and instead redirected the resource into my development team, building new customer features.”

GoBots has been able to expand its MongoDB usage to deliver ever-higher-value features in its platform over time. It uses MongoDB’s app-driven intelligence to power dashboards that help retailers track questions and complaints, identify opportunities, measure marketing activities, and optimize the customer journey across the marketplace. Its adoption of Atlas Vector Search is the latest example of how the company is expanding application functionality without losing the benefits of building and running on the single, unified Atlas developer data platform.

Figure 3: Real-time analytics provide retailers with instant insights to better serve their customers and grow revenue

The results and what's next

By working with hundreds of customers running on Latin America’s largest marketplaces, GoBots has built a compelling track record of achievement: By using GoBots AI for e-commerce with MongoDB Atlas, customers have grown sales conversions by 40% and reduced time to customer response by 72%. Looking forward, GoBots’ adoption of generative AI and vector search will further drive results across the retail marketplace experience.
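The nested question-answer records described earlier map naturally onto a single document that accumulates outputs from multiple models. As an illustrative sketch (field names and values are hypothetical, not GoBots' actual schema):

```python
# Illustrative shape of a question document enriched with inferences from
# several models over time; adding a new model's output needs no schema
# migration. All field names and values are hypothetical.
question_doc = {
    "marketplace": "mercadolibre",
    "question": "Does this drill come with a battery?",
    "entities": [{"type": "PRODUCT_FEATURE", "text": "battery"}],  # NER output
    "intent": "product_spec",
    "inferences": [
        {"model": "nlu-v3", "answer": "Yes, one 18V battery is included.", "confidence": 0.93},
        {"model": "llama-2-13b", "answer": "Yes, it ships with a battery.", "confidence": 0.88},
    ],
}

# Pick the highest-confidence answer; below the threshold, a human agent
# would review the pre-generated suggestion instead of auto-replying.
best = max(question_doc["inferences"], key=lambda i: i["confidence"])
auto_reply = best["confidence"] >= 0.9
```

With pymongo, `db.questions.insert_one(question_doc)` persists the whole nested structure directly, with no tabular flattening step.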
Being part of MongoDB’s AI Innovators Program provides GoBots with free Atlas credits along with access to live technical reviews, helping the company de-risk AI developments. If you are building your own AI-powered apps, apply for the program and take MongoDB Atlas for a spin. It's the quickest way to see why retailers around the world use MongoDB.

March 6, 2024
Artificial Intelligence

RegData & MongoDB: Streamline Data Control and Compliance

While navigating the requirements of keeping data secure in highly regulated markets, organizations can find themselves entangled in a web of costly and complex IT systems. Whether it's the GDPR safeguarding European personal data or the Monetary Authority of Singapore's guidelines on outsourcing and cloud computing, the greater the number of regulations organizations are subject to, particularly across multiple geographical locations, the more intricate their IT infrastructure becomes. Organizations today face the challenge of adapting immediately or facing the consequences.

In addition to regulations, customer expectations have become a major driver for innovation and modernization. In the financial sector, for example, customers demand a fast and convenient user experience with real-time access to transaction information, a fully digitized, mobile-first banking experience, and personalization and accessibility for their specific needs. While these sorts of expectations have become the norm, they conflict with the complex infrastructures of modern financial institutions. Many financial institutions are saddled with legacy infrastructure that holds them back from adapting quickly to changing market conditions. Established financial institutions must find a way to modernize, or they risk losing market share to nimble challenger banks with cost-effective solutions. The banking market today is increasingly populated with nimble fintech companies powered by smaller and more straightforward IT systems, which makes it easier for them to pivot quickly. In contrast, established institutions often operate across borders, meaning they must adhere to a greater number of regulations. Modernizing these complex systems requires introducing new, disruptive technology without violating any regulatory constraints, akin to driving a car while changing a tire.
The primary focus for established banks is safeguarding existing systems to ensure compliance with regulatory constraints while prioritizing customer satisfaction and maintaining smooth operations as usual.

RegData: Compliance without risk

A multi-cloud application security platform, RegData embraces this challenge head-on. RegData has expertise across a number of highly regulated markets, from healthcare to public services, human resources, banking, and finance. The company’s mission is clear: delivering a robust, auditable, and confidential data protection platform within its comprehensive RegData Protection Suite (RPS), built on MongoDB. RegData provides its customers with more than 120 protection techniques, including 60 anonymization techniques as well as custom techniques (protection of IBANs, SSNs, emails, etc.), giving them total control over how sensitive data is managed within each organization. For example, by working with RegData, financial institutions can configure their infrastructure to meet specific regulations by masking, encrypting, tokenizing, anonymizing, or pseudonymizing data into compliance. With RPS, company-wide reports can be automatically generated for the regulating authorities (e.g., ACPR, ECB, EU-GDPR, FINMA).

To illustrate the impact of RPS, and to debunk some common misconceptions, let’s explore before-and-after scenarios. Figure 1 shows the decentralized management of access control. Some data sources employ features such as Field Level Encryption (FLE) to shield data, restricting access to individuals with the appropriate key. Additionally, certain applications implement Role-Based Access Control (RBAC) to regulate data access within the application. Some even come with an Active Directory (AD) interface to try to centralize the configuration.
Figure 1: Simplified architecture with no centralized access control

However, each of these only addresses part of the challenge of encrypting the actual data and managing single-system access. Neither FLE nor RBAC can protect data that isn’t in their data source or application. Even centralizing efforts like the AD interface exclude older legacy systems that might not have interfacing functionality. The result in all of these cases is a mosaic of different configurations in which silos stay silos, and modernization is risky and slow because the data may or may not be protected.

RegData, with its RPS solution, can integrate with a plethora of different data sources and provide control regardless of how data is accessed, be it via the web, APIs, files, emails, or other channels. This allows organizations to configure RPS at a company level. All applications, including silos, can and should interface with RPS to protect all of the data with a single global configuration. Another important aspect of RPS is its tokenization functionality, which allows organizations to decide which columns or fields from a given data source should be encrypted according to specific standards and to govern access to the corresponding tokens. Thanks to tokenization, RPS can track who accesses what data and when, at a company level, regardless of the data source or the application. This is easy enough to articulate but quite difficult to execute at the data level. To efficiently manage diverse data sources, apply fine-grained authorization, and implement different protection techniques, RegData builds RPS on top of MongoDB's flexible, document-oriented database.

The road to modernization

As noted, to fully leverage RegData’s RPS, all data sources should go through RPS. RPS works like a data filter: all of the information goes in, and protected data comes out the other side, ready for modernization and innovation.
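RPS itself is a proprietary platform, but the tokenization idea it applies can be sketched in a few lines. Purely for illustration (this is not RegData's API, and the class and function names are invented), a field-level tokenizer replaces sensitive values with opaque tokens and keeps the mapping in a governed vault where access can be audited:

```python
import secrets

# Illustrative field-level tokenization, not RegData's implementation:
# sensitive values are swapped for opaque tokens; the vault holds the
# token-to-value mapping and is where access governance would live.
class TokenVault:
    def __init__(self):
        self._vault = {}  # token -> original value

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(8)
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # A real system would enforce the caller's entitlements here
        # and write an audit record of who accessed what, and when.
        return self._vault[token]

def protect(doc: dict, fields: list, vault: TokenVault) -> dict:
    """Return a copy of doc with the named fields tokenized."""
    out = dict(doc)
    for f in fields:
        if f in out:
            out[f] = vault.tokenize(out[f])
    return out

vault = TokenVault()
client = {"name": "Jane Doe", "iban": "CH93 0076 2011 6238 5295 7"}
protected = protect(client, ["iban"], vault)
```

Downstream applications and silos only ever see the `protected` document; only governed callers can resolve tokens back to plain text.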
Just integrating RegData means being able to make previously siloed data available by masking, encrypting, or anonymizing it before sending it out to other applications and systems. Together, RegData and MongoDB form a robust and proven solution for protecting data and modernizing operations within highly regulated industries.

The illustration below shows the architecture of a private bank utilizing RPS. Data can only be seen in plain text by database admins when the request comes from the company’s headquarters. This ensures compliance with regulations while still allowing data to be queried and searched outside the headquarters. This bank goes a step further by migrating its Customer Relationship Management (CRM), core banking, Portfolio Management System (PMS), customer reporting, advisory, tax reporting, and other digital apps into the public cloud. This is achieved while remaining compliant and able to automatically generate submittable audit reports for regulating authorities.

Figure 2: Private bank business case

Another possible modernization scheme, given RegData’s functionality, is a hybrid cloud Operational Data Layer (ODL) using MongoDB Atlas. This architectural pattern acts as a bridge between consuming applications and legacy solutions. It centrally integrates and organizes siloed enterprise data, rendering it easily available. Its purpose is to offload legacy systems by providing alternative access to information for consuming applications, thereby breaking down data silos, decreasing latency, allowing scalability, flexibility, and availability, and ultimately optimizing operational efficiency and facilitating modernization. RegData integrates, protects, and makes data available, while MongoDB Atlas provides its inherent scalability, flexibility, and availability to empower developers to offload legacy systems.
Figure 3: Example of an ODL with both RegData and MongoDB

In a world where finding the right solutions can be difficult, RegData provides a strategic path for financial institutions to modernize securely. By combining RegData's regulatory protection with modern cloud platforms such as MongoDB Atlas, the collaboration takes on the modernization challenge of highly regulated sectors. Are you prepared to harness these capabilities for your projects? Do you have any questions? Then please reach out to us at industry.solutions@mongodb.com or info@regdata.ch.

You can also take a look at the following resources:

- Hybrid Cloud: Flexible Architecture for the Future of Financial Services
- Implementing an Operational Data Layer

February 29, 2024
Applied

Atlas Data Federation and Online Archive Can Now Be Deployed in Azure

This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文.

Exciting developments are on the horizon for users of Microsoft Azure, marking a significant leap in data management capabilities. First off, Atlas Data Federation is now generally available on Azure. This means you can now deploy it directly within Azure and even query data from Microsoft Azure Blob Storage. And that's not all: we've also launched the general availability of Atlas Online Archive on Azure, ushering in a new era of efficient archiving for Azure-based data solutions. Both updates are big steps forward in making data management on Azure more powerful and flexible. Let's dive into what this means for you!

Azure support in Atlas Data Federation (General Availability)

With Atlas Data Federation, users can seamlessly query, transform, and create views across multiple Atlas databases and cloud object storage solutions, such as Amazon S3 and now Microsoft Azure Blob Storage. This feature, previously exclusive to AWS, is a game-changer, allowing direct deployment within Azure and the ability to tap into Microsoft Azure Blob Storage for data insights.

Figure 1: Tap into Azure Blob Storage easily from the Atlas UI

Key features of Atlas Data Federation

- Cloud flexibility: Choose between AWS and Azure for hosting federated database instances.
- Diverse data sources: Incorporate MongoDB Atlas clusters or Azure storage solutions (Azure Blob Storage and Azure Data Lake Storage Gen2) as data sources for comprehensive queries, including cross-region.
- Advanced aggregation: Comprehensive aggregation capabilities with operators including $match, $lookup, $queryHistory, $merge, and $out, with direct $out support for Azure Blob Storage and Azure Data Lake Storage Gen2.
- Atlas SQL queries on Azure: Execute SQL queries on Azure, integrating MongoDB data for a unified analysis experience.
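A federated query combining these operators is an ordinary aggregation pipeline. As a minimal sketch (database, collection, and field names are hypothetical; see the Data Federation docs for the exact $out-to-Azure options), this pipeline filters recent orders in a federated collection and materializes the result into an Atlas collection:

```python
from datetime import datetime, timezone

# Illustrative federated aggregation: filter documents in a federated
# collection (which may be backed by Azure Blob Storage) and merge the
# results into an Atlas collection. All names are hypothetical.
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)

pipeline = [
    {"$match": {"orderDate": {"$gte": cutoff}, "status": "shipped"}},
    {"$project": {"customerId": 1, "total": 1, "orderDate": 1}},
    # $merge writes the result set into an Atlas collection for fast serving;
    # $out to Azure Blob Storage / ADLS Gen2 is also supported (see docs).
    {"$merge": {"into": {"db": "analytics", "coll": "recent_orders"}}},
]
```

Connected to a federated database instance with pymongo, this would run as `db.orders.aggregate(pipeline)`.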
Atlas Data Federation simplifies accessing and analyzing complex data sets by combining data across multiple sources into a single, federated view, providing valuable insights for more informed business decisions. Explore Atlas Data Federation on Azure today.

Azure support in Atlas Online Archive (General Availability)

Atlas Online Archive's expansion to Azure ensures that data tiering is not only efficient but also integrated, keeping archival data within the Azure ecosystem. This addresses the previous limitation of defaulting to AWS for storage, even for Azure-hosted clusters.

Figure 2: Seamlessly select Azure when choosing a region

Key features of Atlas Online Archive:

- Provider choice: Opt for AWS or Azure to align with your cloud strategy.
- Automatic archiving: Set rules to move older data to cost-effective cloud storage automatically, eliminating manual offloading.
- Unified querying endpoint: Access all data through a single endpoint, ensuring quick insights without compromising data availability.
- Integrated MongoDB Atlas UI management: Manage data tiering and archiving within the familiar Atlas interface, streamlining operations and maintenance.

Atlas Online Archive empowers you to manage your MongoDB Atlas data lifecycle at scale, balancing cost and accessibility with ease. Finally, here are a few points to consider:

- Any archive newly created on an Azure cluster on or after 02/28 will default to Azure regions. Note that storage regions for Online Archive default to Azure only if there are no pre-existing AWS archives on that specific Azure cluster.
- If there are any pre-existing AWS Online Archives on an Azure cluster, all newly created archives on that cluster will remain on AWS.
- Cloud providers and storage regions cannot be edited or modified once configured.

Embrace the full potential of Atlas Online Archive on Azure today.
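The archiving rules mentioned above are date-based criteria plus optional partition fields. As a rough sketch, an archive rule might look like the following; the field names follow the general shape of the Atlas Administration API's Online Archive configuration and should be treated as illustrative, not authoritative.

```python
# Illustrative Online Archive rule: move "orders" documents older than
# one year to cloud object storage. Field names are assumptions based on
# the Atlas Administration API shape -- verify against the docs.
archive_rule = {
    "dbName": "sales",
    "collName": "orders",
    "criteria": {
        "type": "DATE",          # archive based on a date field
        "dateField": "created_at",
        "expireAfterDays": 365,  # documents older than this are archived
    },
    # Partition fields can speed up queries against archived data.
    "partitionFields": [
        {"fieldName": "region", "order": 0},
        {"fieldName": "customer_id", "order": 1},
    ],
}
```

In practice such a rule is created through the Atlas UI or Administration API; once configured, matching documents are moved automatically with no manual offloading.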
We're thrilled to support your data management journey, offering enhanced control and flexibility over your data through these new Azure capabilities. As MongoDB Atlas continues to expand as a multi-cloud solution, we're here to ensure your data strategy is as dynamic and versatile as your business needs. For guidance on getting started, check out our documentation on Atlas Data Federation or Atlas Online Archive . Thank you for trusting MongoDB Atlas as your Developer Data Platform. Welcome to the future of multi-cloud data management!

February 29, 2024
Updates

They Asked, We Answered: A Q&A on Joining MongoDB’s Remote Solutions Center

Our Remote Solutions Center (RSC) team offers those with technical backgrounds interested in working with customers an opportunity to jumpstart a career in pre-sales. We asked Soheyl Rafi, Solutions Architect and former Remote Solutions Center team member, some common questions candidates have about joining the team.

What is the day-to-day like on the Remote Solutions Center team?

Working in the Remote Solutions Center is a dynamic and ever-changing experience, with each day bringing unique challenges and opportunities. You can expect a blend of calls, hands-on activities, customer interactions, and problem-solving. The variety keeps the role super interesting. Part of the diverse work environment comes from the collaboration that this role inherently entails. Throughout the day, you’ll engage in various customer interactions, ranging from discussing project requirements and challenges to proposing tailored solutions. Another aspect is involvement in Technical Feasibility Workshops with customers. This is where your deep technical knowledge comes into play. You’ll be addressing intricate technical questions, providing insights, and ensuring that our solutions align with the customer’s needs. Enablement also plays a pivotal part in the day-to-day. You will spend a lot of time learning new technologies and features released by our Product team, understanding the competition, and staying abreast of the market as a whole. All in all, the day-to-day work is extremely diverse, and you’ll need to both enjoy and be comfortable wearing multiple hats.

What can I expect during the onboarding phase?

When you join the RSC, you can expect a comprehensive onboarding experience that covers both technical and sales aspects. Our onboarding plan is designed to provide hands-on training, ensuring that you become familiar with MongoDB technology and business processes. From a technical standpoint, you’ll have access to detailed training sessions and resources.
You’ll be guided through the intricacies of our products and services, allowing you to build a strong foundation in your technical knowledge. On the sales front, we have a tailored onboarding plan that focuses on honing the skills required for successful client engagement. This includes understanding our market positioning, customer needs, and effective sales strategies. You will be exposed to real-world scenarios and practical exercises to enhance your sales acumen. To complement your onboarding experience, each new team member is paired with a more senior peer within the RSC. This mentorship helps you begin to build relationships within the team and provides valuable insights and guidance on navigating the role effectively. In addition, RSC leadership recognizes the importance of learning from industry veterans. That’s why each new hire is also paired with a seasoned professional from outside the RSC who acts as a Solution Architecture buddy. This experienced mentor with tenure in the industry will offer a unique perspective and share valuable insights to accelerate your learning. To gain practical exposure, you will have the opportunity to shadow calls and workshops conducted by experienced team members. This hands-on approach allows you to observe firsthand how we conduct business, manage client interactions, and collaborate within the team. In summary, our onboarding program within the RSC is a holistic approach that combines technical training, sales development, mentorship, and practical experience.

How often can I expect to collaborate with Solutions Architects (SAs) in the field?

As an SA within the RSC, collaboration with SAs in the field is consistently high. We are strategically positioned around account activities and proactively engage with our counterparts on the field teams.
Depending on the opportunity, during an engagement you might assist in the early stages of the sales cycle, such as discovery and demos, before passing the deal on to the field SA. Collaboration in these engagements is pivotal, requiring alignment to ensure a seamless handover to the field teams.

What internal career development opportunities are there for me?

Upon joining the RSC team, you’ll immerse yourself in a dynamic learning environment. Unlike traditional career structures, the RSC team fosters an environment where you are in the driver’s seat, dictating the pace of your internal career development. No artificial roadblocks hinder you from taking on more challenging and senior-level tasks. Your dedication, skills, and initiative are the primary determinants of how quickly you progress. While there are specific tasks you’ll work on in your day-to-day, you’ll also have a wide range of internal projects available to enhance your skill sets and advance your career within MongoDB. In a team with diverse interests and skills, you can choose projects that align with your passion and interests. You’ll find yourself with structured learning paths, mentorship programs, project leadership opportunities, and a career trajectory limited only by your aspirations. In my career journey, I’ve achieved my goal of growing from an Associate SA to being promoted to a Solutions Architect for our enterprise customers. I am now a dedicated Solutions Architect to one of our biggest financial customers, helping them in their digital transformation and expanding their MongoDB footprint.

What new things will I learn if I join the team?

The question should be, “What will you not learn?” Databases are at the center of every tech stack. You will be exposed to and gain an understanding of the entire tech stack, including the underlying infrastructure and the application that will be built on top of MongoDB.
In your role, you’ll find yourself engaging with customers to discuss the various technologies used in application development, the infrastructure decisions made, and other database solutions that are either part of their tech stack or under consideration. Each customer employs different methodologies for developing software and utilizes various programming languages and solutions. It is pivotal as an SA in the RSC to comprehend these diverse solutions and effectively communicate them to our customers. Beyond the technical aspects, you will start to see things from a macro perspective. Understanding your customer’s business is crucial. You will need to learn how to align technical solutions with business objectives, considering factors such as budget, timelines, and return on investment. Learn more about applying your technical skills and engaging with customers as part of our Remote Solutions Center.

February 28, 2024
Culture

Building AI With MongoDB: Story Tools Studio Brings Gen AI To Gaming With Myth Maker AI

Story Tools Studio harnesses cutting-edge generative AI (gen AI) technologies to craft immersive, personalized, and infinite storytelling experiences. Roy Altman, Founder and CEO at Story Tools Studio, says, “Our flagship game Myth Maker AI leverages MUSE, an internally developed AI-powered, expert-guided story generator that blends a growing collection of advanced AI technology with creative artistry to weave real-time narratives.” The company’s founders have backgrounds in stage, film, and video production, and both share a deep love of gaming. When ChatGPT first launched, they immediately recognized an opportunity to redefine gaming experiences. And so Story Tools Studio and its MUSE engine were born. MUSE (Modular User Story Engine) combines professionally crafted stories with user-empowered experiences. Players make intentional choices that guide the story, with AI adapting to each decision in real time, providing a unique and personalized journey. MUSE separates the story from game mechanics, allowing the development of multiple game types. Its use of AI creates more agile teams with fewer dependencies.

Generating gameplay with AI

When a player starts a game in Myth Maker AI, they are presented with the option to choose their starting hero character. Under the covers, MUSE calls the GPT-4 API, which takes the player’s selection and writes a fully customized adventure premise. From that initial personalized script, MUSE programmatically calls specialized AI models to collaboratively generate an immersive, multimodal gaming experience using images, animation, audio, and soon, video and 3D.

Figure 1: MUSE orchestrates multimodal gen AI to create real-time, unlimited stories

Altman says, “For story generation and text to voice, we run mainly in Azure’s OpenAI service. Visual assets are created via Leonardo AI, and we are constantly experimenting with new models to create richer modalities.
Right now my team is working on generating enhanced 3D assets and video from text prompts. With the pace of AI advancement, the creativity of my team, and the input from our game testers, we are continuously deploying new features. And MongoDB, with its dynamic and flexible document data model, gives my developers the freedom to do that. This enables us to build a truly innovative, artistic platform, opening up a whole new world of experiences for both creators and audiences alike.”

“By selecting MongoDB, we were able to create a prototype of our game in just 48 hours. It is only with MongoDB that we can release new features to production multiple times per day. We couldn’t achieve any of this with a relational database.”
Roy Altman, Founder and CEO at Story Tools Studio

AI, transactions, and analytics with MongoDB

Altman’s engineering team has used MongoDB Atlas from the very start of the company. MongoDB stores all of the data used in the platform: user data, scripts, characters, worlds, coins, and prompts are all richly structured objects stored natively in MongoDB. The games are built in React and JavaScript. Beyond gameplay, the company’s developers are now exploring MongoDB’s ACID transactional integrity to support in-game monetization, alongside in-app intelligence to further improve the gaming experience through player analytics. “By running MongoDB in Atlas, my engineering team is free to focus on AI-driven gaming experiences, and not on the grind of managing a database,” says Altman. “MongoDB has scaled seamlessly and automatically as we’ve graduated from our closed beta into public beta today. Every 24 hours we are organically adding dozens of new players, with tens of gigabytes of new data streaming into the platform. We expect this to grow quickly when we kick off our marketing campaigns.”

What's next?

Myth Maker AI is just the start. Story Tools Studio has plans for multiple games in the next 12 to 18 months.
At the same time MUSE is an extensible platform — the company is also looking to explore content generation outside of gaming in areas such as education and training. Story Tools Studio is a member of the MongoDB AI Innovators program providing it with access to Atlas credits and technical best practices, allowing its engineers to freely explore what’s possible with gen AI and MongoDB. You can get started with your AI-powered apps by registering for MongoDB Atlas and exploring the tutorials available in our AI resources center . You can also learn more about why the world’s largest games run on MongoDB .
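To make the document model described above concrete, here is a hypothetical sketch of what one of those richly structured game objects might look like. Every field name is invented for illustration; it is not Story Tools Studio's actual schema.

```python
# Hypothetical character document: nested objects and arrays live in a
# single MongoDB document instead of being spread across joined tables.
character = {
    "name": "Aria",
    "archetype": "hero",
    "world": {"name": "Eldenmoor", "biome": "forest"},
    "coins": 42,
    "inventory": [
        {"item": "lantern", "coins_cost": 5},
        {"item": "map_fragment", "coins_cost": 12},
    ],
    "prompt_history": [
        {"turn": 1, "choice": "enter the ruins"},
    ],
}

# With a driver such as PyMongo, this dictionary would be stored as-is:
#   db.characters.insert_one(character)
```

Because the whole object travels together, adding a new field (say, a generated 3D asset reference) is a one-line change to the document rather than a schema migration.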

February 27, 2024
Artificial Intelligence

Announcing the GA of the Atlas Device SDK for C++

MongoDB's Developer Data Platform was designed to offer unparalleled flexibility and scalability for developers. By streamlining the integration of complex data structures and real-time analytics, and by accelerating the development and deployment of mission-critical applications, it has added significant value to businesses across industries. Today we continue our mission to provide the best experience for developers and are excited to announce the general availability (GA) of the Atlas Device SDK for C++. The updates in this release come after numerous iterations guided by feedback from our preview users, targeting performance and portability. The Atlas Device SDK for C++ enables developers to effortlessly store data on devices for offline access while seamlessly synchronizing data to and from the MongoDB Atlas cloud within their C++ applications. It serves as a user-friendly alternative to SQLite: as an object-oriented database, it removes the need for a separate mapping layer or ORM. Aligned with MongoDB's developer data platform mission of streamlining the development process, the C++ SDK incorporates networking retry logic and advanced conflict-merging functionality, eliminating the traditional need for writing and maintaining extensive and complex synchronization code.

Why choose the Atlas Device SDK for C++?

The Atlas Device SDK for C++ is particularly well suited for applications in embedded devices, IoT, and cross-platform scenarios. It serves as a comprehensive object-oriented persistence layer for edge, mobile, and embedded devices, offering built-in support for synchronization with MongoDB Atlas as a cloud backend.
In the evolving landscape of connected and smart devices, the demand for more data, including historical data for automated decision-making, highlights the importance of efficient persistence layers and real-time cloud-syncing technologies that are robust to changing network connections and outages. The database included in the Atlas Device SDK for C++ comes with over a decade of history and is a mature, feature-rich, enterprise-ready technology, integrated into tens of thousands of applications on Google Play and the Apple App Store with billions of downloads. Its lightweight design is optimized for resource-constrained environments, accounting for compute, memory, bandwidth, and battery usage. Embedding the SDK directly into application code eliminates the need for additional deployment tasks and simplifies the development process. The fully object-oriented nature of the SDK guides the data modeling, providing a straightforward and idiomatic approach. This stands in contrast to alternative technologies like SQLite, which require an object-relational mapping library, adding complexity and making future development, maintenance, and debugging more challenging. Furthermore, the SDK's underlying data store enables seamless integration with reactive UI layers across various environments. In the Atlas Device SDK for C++ we give examples of how to integrate with the Qt framework, but other UI layers can also be added.

Improvements in the GA release

The new API was developed based on performance measurements, with a coordinated focus on improving the read/write operations of the data layer. There has been great interest from major automotive and manufacturing OEMs, and their feedback has been invaluable in guiding our final API. Some of the changes added to the Atlas Device SDK for C++ include:

- Aligning our APIs with other Atlas Device SDKs, e.g. improved control of the database state with monitoring and manual compaction
- HTTP tunneling
- Better control of Atlas Device Sync sessions
- Windows support
- Compatibility with OpenWRT, among other Linux distributions, by supporting musl
- Android Automotive support with Blueprint/Soong build files

What's next

Looking ahead, we are working towards geospatial support as well as the ability to build with a variety of package managers such as vcpkg and Conan. We welcome and value all feedback: if you have any comments or suggestions, please share them via our GitHub project. Ready to get started? Install the Atlas Device SDK for C++ and start your journey with our docs, or jump right into example projects with source code. Then, register for Atlas to connect to Atlas Device Sync, a fully managed mobile backend as a service. Leverage out-of-the-box infrastructure, data synchronization capabilities, network handling, and much more to quickly launch enterprise-grade mobile apps. Finally, let us know what you think, and get involved in our forums. See you there!

February 22, 2024
Updates

Should I Begin a Pre-Sales Career at MongoDB? Insights from Our Remote Solutions Center

Do you have a technical background and enjoy working with customers? Have you considered beginning a career in pre-sales? Aicha Sarr, Solutions Architect at MongoDB, shares insights into how our Remote Solutions Center offers a path to building a career in pre-sales. Read on to learn more about how our Remote Solutions Center team builds off of each other's strengths, applies their technical expertise, and focuses on customer success.

Diverse backgrounds, shared attributes

Our Remote Solutions Center team values both diverse backgrounds and shared attributes. Team members possess a blend of technical expertise and a customer-focused mindset, with a strong affinity for technology and an inherent curiosity to grasp knowledge related to MongoDB, databases, and complex technical concepts. We come from a variety of backgrounds: data engineers, software developers, cloud architects, and sales engineers. While direct experience with MongoDB is beneficial, it's not mandatory. Similarly, a background as an engineer can be advantageous, but it's not strictly necessary. To succeed on the team, one needs a keen interest in the technology and architecture of systems, an appetite for learning, and a commitment to ensuring customer success. We are trusted advisors, collaborating closely with customers to design and implement reliable systems, ensuring the strong technical alignment between the platform and customer needs that is pivotal for success.

Hands-on application of technical skills

I'm frequently engaged in projects that require me to apply my technical skills. Many of these projects have been focused on building reusable demos that are distributed to our global team, providing an opportunity for my work to be showcased during technical calls or workshops. For instance, a teammate and I developed a mobile application centered around MongoDB Atlas for the Edge, our edge computing solution.
The project included demonstrating the synchronization of user data between intermittently internet-connected devices and a central database, MongoDB Atlas. This demo showcased the practical application of Atlas Edge computing, maintaining data consistency across various devices and reflecting the real-world implications of mobile application development and data synchronization challenges. Additionally, I've developed an application illustrating the implementation of a Client-Side Field Level Encryption (CSFLE)-enabled application using Amazon Web Services Key Management Service, with Java Spring Boot as the programming framework. This hands-on experience allowed me to showcase the robust security features of CSFLE, a feature that enables the encryption of data in an application before it is sent over the network to MongoDB, and a tangible example of how to enhance application security within a cloud environment. In my customer calls, I've used both my own demos and demos built by my colleagues. These hands-on experiences play a crucial role in enhancing our team's collective knowledge, capabilities, and resources. This collaboration between team members fosters a shared understanding and serves as a valuable learning opportunity, allowing us to disseminate acquired knowledge internally and promote a culture of continuous learning and skill development.

The customer engagement journey

Our customer engagement journey begins with establishing strong relationships during initial contact and holding in-depth discussions to grasp the customer's business requirements. This phase serves as the foundation for crafting tailored proposals that outline effective solutions. Technical presentations elucidate the proposed solutions' intricacies, involving iterative, collaborative discussions and adjustments based on continuous feedback from customer stakeholders, including IT teams and decision-makers.
Transitioning to the implementation phase, Proof of Concept projects may be initiated for real-world solution applications. The seamless move to implementation teams ensures ongoing support, addressing post-implementation issues for continuous customer satisfaction. This holistic approach emphasizes effective communication, active listening, and building lasting partnerships grounded in trust. Engaging with diverse customer personas, each playing a distinct role in decision-making, involves tailored interactions. Technical decision-makers engage in in-depth technical discussions, addressing integration concerns and compatibility. Business decision-makers require presentations emphasizing business value, ROI, and alignment with overarching goals. Interactions with developers or data engineers involve technical implementation discussions and collaborative problem-solving in specific programming languages. These interactions with end users, project managers, and technical teams are essential for gathering insights into practical requirements, project timelines, and seamless implementation. The shared goal is to create solutions that meet business and technical requirements, ensuring feasibility, efficiency, and alignment with customer capabilities.

Conclusion: A tapestry of skills and collaboration

The success of our Remote Solutions Center lies in the tapestry of diverse backgrounds, shared attributes, and a commitment to continuous learning. The opportunities for hands-on application of technical skills not only strengthen our individual capabilities but also contribute to a collective knowledge pool within our team. The customer engagement process is a collaborative journey, with an emphasis on the importance of understanding and adapting to evolving customer needs. This holistic approach to team dynamics, skill development, and customer engagement positions our Remote Solutions Center for success in a rapidly changing technological landscape.
Learn more about careers in solutions consulting at MongoDB.

February 21, 2024
Culture

Building AI with MongoDB: Accelerating App Development With the Codeium AI Toolkit

Of the many use cases set to be transformed by generative AI (gen AI), the bleeding edge of this revolution is underway with software development. Developers are using gen AI to improve productivity by writing higher-quality code faster. Tasks include autocompleting code, writing docs, generating tests, and answering natural language queries across a code base. How does this translate to adoption? A recent survey showed that 44% of newly committed code was written by an AI code assistant. Codeium is one of the leaders in the fast-growing AI code assistant space. Its AI toolkit is used by hundreds of thousands of developers across more than 70 languages and more than 40 IDEs, including Visual Studio Code, the JetBrains suite, Eclipse, and Jupyter Notebooks. The company describes its toolkit as “the modern coding superpower,” reflected by its recent $65 million Series B funding round and five-star reviews across extension marketplaces. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. As Anshul Ramachandran, Head of Enterprise & Partnerships at Codeium, explains, “Codeium has been developed by a team of researchers and engineers to build on the industry-wide momentum around large language models, specifically for code. We realized that our specialized generative models, when deployed on our world-class optimized deep learning serving software, could provide users with top-quality AI-based products at the lowest possible costs (or ideally, free). The result of that realization is Codeium." Codeium has recently trained its models on MongoDB code, libraries, and documentation. Now developers building apps with MongoDB can install the Codeium extension on the IDE of their choice and enjoy rapid code completion and codebase-aware chat and search. Developers can stay in the flow while they build, coding at the speed of thought, knowing that Codeium has ingested MongoDB best practices and documentation.
“MongoDB is wildly popular across the developer community. This is because Atlas integrates the fully managed database services that provide a unified developer experience across transactional, analytical, and generative AI apps.”
Anshul Ramachandran, Head of Enterprise & Partnerships, Codeium

Ramachandran goes on to say, “MongoDB APIs are incredibly powerful, but due to the breadth and richness of the APIs, it is possible for developers to be spending more time than necessary looking through API documentation or using the APIs inefficiently for the task at hand. An AI assistant, if trained properly, can effectively assist the developer in retrieval and usage quality of these APIs. Unlike other AI code assistants, we at Codeium build our LLMs from scratch and own the underlying data layer. This means we accelerate and optimize the developer experience in unique and novel ways unmatched by others.”

Figure 1: By simply typing statement names, the Codeium assistant will automatically provide code completion suggestions directly within your IDE.

In its announcement blog post and YouTube video, the Codeium team shows how to build an app in VSCode with MongoDB serving as the data layer. Developers can ask questions on how to read and write to the database, get code completion suggestions, explore specific functions and syntax, handle errors, and more. This was all done at no cost using the MongoDB Atlas free tier and Codeium's free-forever individual plan. You can get started today by registering for MongoDB Atlas and then downloading the Codeium extension. If you are building your own AI app, sign up for the MongoDB AI Innovators Program. Successful applicants get access to free Atlas credits and technical enablement, as well as connections into the broader AI ecosystem.

February 21, 2024
Artificial Intelligence

Reducing Bias in Credit Scoring with Generative AI

Credit scoring plays a pivotal role in determining who gets access to credit and on what terms. Despite its importance, however, traditional credit scoring systems have long been plagued by critical issues, from bias and discrimination to limited data consideration and scalability challenges. For example, a study of US loans showed that minority borrowers were charged higher interest rates (+8%) and had loans rejected more often (+14%) than borrowers from more privileged groups. The rigid nature of credit systems means that they can be slow to adapt to changing economic landscapes and evolving consumer behaviors, leaving some individuals underserved and overlooked. To overcome this, banks and other lenders are looking to adopt artificial intelligence to develop increasingly sophisticated models for scoring credit risk. In this article, we'll explore the fundamentals of credit scoring and the challenges current systems present, and delve into how artificial intelligence (AI), in particular generative AI (gen AI), can be leveraged to mitigate bias and improve accuracy. From the incorporation of alternative data sources to the development of machine learning (ML) models, we'll uncover the transformative potential of AI in reshaping the future of credit scoring. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is credit scoring?

Credit scoring is an integral aspect of the financial landscape, serving as a numerical gauge of an individual's creditworthiness. This vital metric is employed by lenders to evaluate the potential risk associated with extending credit or lending money to individuals or businesses. Traditionally, banks rely on predefined rules and statistical models, often built using linear or logistic regression.
The models are based on historical credit data, focusing on factors such as payment history, credit utilization, and length of credit history. However, assessing new credit applicants poses a challenge, leading to the need for more accurate profiling. To cater to the underserved or unserved segments traditionally discriminated against, fintechs and digital banks are increasingly incorporating information beyond traditional credit history with alternative data to create a more comprehensive view of an individual's financial behavior. Challenges with traditional credit scoring Credit scores are integral to modern life because they serve as a crucial determinant in various financial transactions, including securing loans, renting an apartment, obtaining insurance, and even sometimes in employment screenings. Because the pursuit of credit can be a labyrinthine journey, here are some of the challenges or limitations with traditional credit scoring models that often cloud the path to credit application approval. Limited credit history: Many individuals, especially those new to the credit game, encounter a significant hurdle – limited or non-existent credit history. Traditional credit scoring models heavily rely on past credit behavior, making it difficult for individuals without a robust credit history to prove their creditworthiness. Roughly 45 million Americans lack credit scores simply because those data points do not exist for them. Inconsistent income: Irregular income, typical in part-time work or freelancing, poses a challenge for traditional credit scoring models, potentially labeling individuals as higher risk and leading to application denials or restrictive credit limits. In 2023 in the United States , data sources differ on how many people are self-employed. 
One source shows that more than 27 million Americans filed Schedule C tax documents, which cover net income or loss from a business, highlighting the need for different credit scoring methods for the self-employed.

High utilization of existing credit: Heavy reliance on existing credit is often perceived as a signal of potential financial strain, influencing credit decisions. Applications may face rejection, or approval on less favorable terms, reflecting concerns about the applicant's ability to manage additional credit judiciously.

Lack of clarity in rejection reasons: Not understanding the reasons behind a rejection keeps applicants from addressing the root causes. In the UK, a study between April 2022 and April 2023 found the main reasons for rejection were "poor credit history" (38%), "couldn't afford the repayments" (28%), and "having too much other credit" (19%), while 10% of applicants weren't told why at all. Even when reasons are given, they are often too vague, leaving applicants in the dark and making it difficult for them to address the root cause and enhance their creditworthiness for future applications. The lack of transparency is not only a problem for customers; it can also mean penalties for banks. A Berlin bank, for example, was fined €300k in 2023 for lacking transparency in declining a credit card application.

Lack of flexibility: Shifts in consumer behavior, especially among younger generations who prefer digital transactions, challenge traditional models. Factors like the rise of the gig economy, non-traditional employment, student loan debt, and high living costs complicate assessments of income stability and financial health. Traditional credit risk predictions are also of limited use during unprecedented disruptions like COVID-19, which scoring models do not take into account.
Recognizing these challenges highlights the need for alternative credit scoring models that can adapt to evolving financial behaviors, handle non-traditional data sources, and provide a more inclusive and accurate assessment of creditworthiness in today's dynamic financial landscape.

Credit scoring with alternative data

Alternative credit scoring refers to the use of non-traditional data sources (alternative data) and methods to assess an individual's creditworthiness. While traditional credit scoring relies heavily on credit history from major credit bureaus, alternative credit scoring incorporates a broader range of factors to create a more comprehensive picture of a person's financial behavior. Popular alternative data sources include:

Utility payments: Beyond credit history, consistent payments for utilities like electricity and water are a powerful indicator of financial responsibility, revealing a commitment to meeting financial obligations and providing crucial insights beyond traditional metrics.

Rental history: For those without a mortgage, rental payment history is a key alternative data source. Consistent, timely rent payments paint a comprehensive picture of financial discipline and reliability.

Mobile phone usage patterns: The ubiquity of mobile phones unlocks a wealth of alternative data. Analyzing call and text patterns provides insights into an individual's network, stability, and social connections, contributing valuable information for credit assessments.

Online shopping behavior: Examining the frequency, type, and amount spent on online purchases offers valuable insights into spending behaviors, contributing to a more nuanced understanding of financial habits.

Educational and employment background: Alternative credit scoring also considers an individual's educational and employment history.
Positive indicators, such as educational achievements and stable employment, play a crucial role in assessing financial stability.

These alternative data sources represent a shift towards a more inclusive, nuanced, and holistic approach to credit assessment. As financial technology continues to advance, leveraging these alternative data sets enables a more comprehensive evaluation of creditworthiness, marking a transformative step in the evolution of credit scoring models.

Alternative credit scoring with artificial intelligence

Besides the use of alternative data, AI has emerged as a transformative force to address the challenges of traditional credit scoring, for a number of reasons:

Ability to mitigate bias: Like traditional statistical models, AI models, including LLMs, trained on biased historical data will inherit the biases present in that data, leading to discriminatory outcomes. LLMs might weight certain features over others, or may lack the ability to understand the broader context of an individual's financial situation, leading to biased decision-making. However, various techniques can mitigate the bias of AI models:

Mitigation strategies: These begin with the use of diverse and representative training data, to avoid reinforcing existing biases; inadequate or ineffective mitigation strategies let biased outcomes persist, so careful attention to data collection and model development is crucial. Incorporating alternative data for credit scoring also plays a critical role in reducing bias. Rigorous bias detection tools, fairness constraints, and regularization techniques during training enhance model accountability, while balancing feature representation and employing post-processing techniques and specialized algorithms further contribute to bias mitigation.
Inclusive model evaluation, continuous monitoring, and iterative improvement, coupled with adherence to ethical guidelines and governance practices, complete a multifaceted approach to reducing bias in AI models. This is particularly significant in addressing demographic or socioeconomic biases that may be present in historical credit data.

Regular bias audits: Conduct regular audits to identify and mitigate bias in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly.

Transparency and explainability: Increase transparency and explainability in LLMs to understand how decisions are made. This can help identify and address biased decision-making processes. Trade Ledger, a lending software-as-a-service (SaaS) tool, uses a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source.

Ability to analyze vast and diverse datasets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a myriad of information, including non-traditional data sources, to create a more comprehensive assessment of an individual's creditworthiness, ensuring that a broader range of financial behaviors is considered.

Adaptability: As economic conditions change and consumer behaviors evolve, AI-powered models can quickly adjust and learn from new data. This continuous learning ensures that credit scoring remains relevant and effective in the face of ever-changing financial landscapes.

The most common objections from banks to using AI in credit scoring concern transparency and explainability in credit decisions. The inherent complexity of some AI models, especially deep learning algorithms, can make it hard to provide clear explanations for credit decisions.
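A regular bias audit can start from something as simple as the disparate-impact ratio: each group's approval rate divided by the best-treated group's approval rate, with ratios below 0.8 commonly treated as a red flag (the "four-fifths rule"). A minimal sketch over made-up decisions, purely for illustration:

```python
from collections import defaultdict

def disparate_impact(decisions):
    """decisions: iterable of (group, approved) pairs.
    Returns each group's approval rate relative to the best-treated group."""
    approved, total = defaultdict(int), defaultdict(int)
    for group, ok in decisions:
        total[group] += 1
        approved[group] += int(ok)
    rates = {g: approved[g] / total[g] for g in total}
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}

# Hypothetical audit sample: (demographic group, did the model approve credit?)
sample = [("A", True)] * 8 + [("A", False)] * 2 + [("B", True)] * 5 + [("B", False)] * 5
ratios = disparate_impact(sample)  # group B falls below the 0.8 threshold here
```

A real audit would of course control for legitimate risk factors before attributing a low ratio to bias; this check is only a first screen.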
Fortunately, the transparency and interpretability of AI models have seen significant advancements. Techniques like SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-agnostic Explanations (LIME) plots, along with other advances in the domain of explainable AI (XAI), now make it possible to understand how a model arrives at specific credit decisions. This not only enhances trust in the credit scoring process but also addresses the common critique that AI models are "black boxes."

Understanding how critical it is to leverage alternative data, which often comes in semi-structured or unstructured formats, financial institutions work with MongoDB to enhance their credit application processes with a faster, simpler, and more flexible way to make payments and offer credit:

Amar Bank, Indonesia's leading digital bank, is combatting bias by providing microloans to people who wouldn't otherwise be able to get financial services from traditional banks (the unbanked and underserved). Traditional underwriting processes were inadequate for customers lacking credit history or collateral, so Amar Bank streamlined lending decisions by harnessing unstructured data. Leveraging MongoDB Atlas, they developed a predictive analytics model integrating structured and unstructured data to assess borrower creditworthiness. MongoDB's scalability and ability to manage diverse data types were instrumental in expanding and optimizing their lending operations.

For the vast majority of Indians, getting credit is typically challenging due to stringent regulations and a lack of credit data. Through modern underwriting systems, slice, a leading innovator in India's fintech ecosystem, is helping broaden access to credit in India by streamlining its KYC process for a smoother credit experience. By using MongoDB Atlas across different use cases, including as a real-time ML feature store, slice transformed its onboarding process, slashing processing times to under a minute.
slice uses the real-time feature store with MongoDB and ML models to compute over 100 variables instantly, enabling credit eligibility determination in less than 30 seconds.

Transforming credit scoring with generative AI

Beyond alternative data and traditional AI, genAI has the potential to revolutionize credit scoring and assessment with its ability to create synthetic data and understand intricate patterns, offering a more nuanced, adaptive, and predictive approach.

GenAI's capability to synthesize diverse data sets addresses one of the key limitations of traditional credit scoring: the reliance on historical credit data. By creating synthetic data that mirrors real-world financial behaviors, genAI models enable a more inclusive assessment of creditworthiness. This transformative shift promotes financial inclusivity, opening doors for a broader demographic to access credit opportunities.

Adaptability plays a crucial role in navigating the dynamic nature of economic conditions and changing consumer behaviors. Unlike traditional models, which struggle to adjust to unforeseen disruptions, genAI's ability to continuously learn and adapt ensures that credit scoring remains effective in real time, offering a more resilient and responsive tool for assessing credit risk.

In addition to its predictive prowess, genAI can contribute to transparency and interpretability in credit scoring. Models can generate explanations for their decisions, providing clearer insights into credit assessments and enhancing trust among consumers, regulators, and financial institutions.

One key concern in making use of genAI, however, is hallucination, where the model presents information that is nonsensical or outright false. There are several techniques to mitigate this risk; one approach is retrieval-augmented generation (RAG).
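In its simplest form, RAG retrieves the stored documents most similar to a question and passes them to the model as grounding context. In production the retrieval step would typically be a vector search query against the database (for example, Atlas Vector Search over embeddings stored in MongoDB); the sketch below uses hand-made stand-in vectors instead of a real embedding model, and is only meant to show the shape of the idea:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy corpus; in a real system these vectors would come from an embedding
# model and live alongside the documents in the database for vector search.
docs = [
    {"text": "Applicant has 8 years of on-time utility payments.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Applicant's rental history shows two late payments.", "embedding": [0.1, 0.9, 0.1]},
]

def retrieve(query_embedding, k=1):
    """Return the k stored documents closest to the query embedding."""
    return sorted(docs, key=lambda d: cosine(d["embedding"], query_embedding), reverse=True)[:k]

# Ground the prompt in retrieved facts so the model answers from data, not guesswork.
context = retrieve([0.85, 0.15, 0.05])[0]["text"]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: Does the applicant pay utilities on time?"
```

The prompt that reaches the LLM carries the retrieved facts, which is what keeps the answer tied to up-to-date source data rather than the model's training set.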
RAG minimizes hallucinations by grounding the model's responses in factual information from up-to-date sources, ensuring the model's responses reflect the most current and accurate information available.

Patronus AI, for example, leverages RAG with MongoDB Atlas to enable engineers to score and benchmark large language model (LLM) performance on real-world scenarios, generate adversarial test cases at scale, and monitor hallucinations and other unexpected and unsafe behavior. This helps detect LLM mistakes at scale so that AI products can be deployed safely and confidently.

Another MongoDB technology partner is Robust Intelligence. The firm's AI Firewall protects LLMs in production by validating inputs and outputs in real time. It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injection and personally identifiable information (PII) extraction.

As generative AI continues to mature, its integration into credit scoring and broader credit application systems promises not just a technological advancement but a fundamental transformation in how we evaluate and extend credit.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring, marking a pivotal moment in the financial industry. The challenges of traditional models are being overcome through the adoption of alternative credit scoring methods, offering a more inclusive and nuanced assessment. Generative AI, while introducing the potential challenge of hallucination, represents the forefront of innovation, not only revolutionizing technological capabilities but fundamentally redefining how credit is evaluated, fostering a new era of financial inclusivity, efficiency, and fairness.
If you would like to discover more about building AI-enriched applications with MongoDB, take a look at the following resources:

Digitizing the lending and leasing experience with MongoDB

Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect

Discover how slice enables credit approval in less than a minute for millions

February 20, 2024
Applied
