Arthur

The AI Performance Company

About us

The AI Performance Company. We work with enterprise teams to monitor, measure, and improve machine learning models for better results across accuracy, explainability, and fairness. We are deeply passionate about building technology to make AI work for everyone. Arthur is an equal opportunity employer and we believe strongly in "front-end ethics": building a sustainable company and industry where strong performance and a positive human impact are inextricably linked. We're hiring! Take a look at our open roles at arthur.ai/careers.

Website: https://arthur.ai/
Industry: Software Development
Company size: 11-50 employees
Headquarters: New York, New York
Type: Privately Held
Founded: 2018

Updates

Multimodal AI is a total game-changer, because it gives us...

👉 Deeper Understanding & Context: Human communication is inherently multimodal. We use words, gestures, facial expressions, and tone of voice to convey meaning. By mimicking this ability, multimodal AI can achieve a more profound understanding of context and nuance.

👉 Improved Performance of AI Systems: Integrating multiple data sources can improve the accuracy and robustness of AI systems. For example, in medical diagnostics, combining imaging data (like X-rays) with patient history and symptoms can lead to more accurate diagnoses.

👉 New Applications & Innovations: Multimodal AI opens the door to novel applications. Imagine virtual assistants that can not only understand your spoken instructions but also read your facial expressions and body language to better gauge your mood and intentions.

Check out our blog post to learn more about the techniques, use cases, and challenges of multimodal AI: https://bit.ly/4bXKy0f

Interested in learning more about multimodal embeddings and their applications? 🗣️💬🖼️ Join us and Nomic AI in a few weeks for a webinar session where we'll dive into key concepts at the intersection of embeddings and ML observability, take a behind-the-scenes look at building and training a multimodal embedding model, and much more. Save your spot: https://bit.ly/3RCq1Gt

Interested in learning more about multimodal embeddings? Join Zach Nussbaum, Principal ML Engineer at Nomic AI, along with Arthur’s Chief Scientist John Dickerson for a session where they’ll discuss:

🔹 Key concepts at the intersection of embeddings and ML observability
🔹 Best practices for implementation
🔹 A behind-the-scenes look at building and training a multimodal embedding model
🔹 Use cases with multimodal RAG

Learn more and register: https://bit.ly/3RCq1Gt

Love to see Arthurians giving back to the local community! Shoutout to Selena, Teresa, Sabrina, and Madeleine who recently prepped and served dinner to neighbors in need at NYC’s Bowery Mission. 💜

6️⃣ Tools for Getting Started with LLM Experimentation & Development 🛠️🧰

With the field of AI changing at such a rapid pace, it can feel nearly impossible to stay up to date with the latest tools and techniques. Here are a few that our ML Research Scientist Max Cembalest thinks are productive, innovative, and easy to use!

🧑‍🔬 For Experimentation:
- LiteLLM (YC W23): A simple client API that makes it easy to test major LLM providers. It maintains enough of a common format for your LLM inputs for painless swapping between providers (a minimal usage sketch follows after this list).
- Ollama: A tool for experimenting with open-source models, with a git-like CLI to fetch all the latest models (at various levels of quantization, so you can run them quickly from a laptop) and prompt from the terminal.
- MLX: Built specifically for Apple hardware, MLX brings massive improvements to the speed and memory-efficiency of running and training all the standard and state-of-the-art AI models on Apple devices.
- DSPy: Designed to be analogous to PyTorch: every time the LLM, retriever, evaluation criteria, or anything else is modified, DSPy can re-optimize a new set of prompts and examples that max out your evaluation criteria.

📊 For Evaluation:
- Elo: Traditionally used to rank chess players, the Elo rating system has been employed to compare the relative strengths of various AI language models based on votes from human evaluators. It has become a very popular and cost-effective general-purpose metric for quantitatively ranking LLMs from head-to-head blind A/B preference tests (see the rating sketch after this list).
- Arthur Bench: Last but not least, Bench is our open-source evaluation product for comparing LLMs, prompts, and hyperparameters for generative text models. It enables businesses to evaluate how different LLMs will perform in real-world scenarios so they can make informed, data-driven decisions when integrating the latest AI technologies into their operations.
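To make the provider-swapping point concrete, here is a minimal sketch using LiteLLM's completion API. It assumes litellm is installed and the relevant provider API keys (or a local Ollama server) are already configured; the model strings are illustrative, so treat this as a sketch rather than a definitive integration.

```python
# Minimal LiteLLM sketch: one call shape, multiple providers.
# Assumes `pip install litellm` and provider API keys in environment variables;
# the "ollama/..." entry additionally assumes a local Ollama server is running.
from litellm import completion

messages = [{"role": "user", "content": "In one sentence, what is ML observability?"}]

# Model names below are illustrative examples, not specific recommendations.
for model in ["gpt-3.5-turbo", "claude-3-haiku-20240307", "ollama/llama2"]:
    response = completion(model=model, messages=messages)
    print(f"{model}: {response.choices[0].message.content}")
```

The point of the common format is visible in the loop: only the model string changes, while the message payload and response shape stay the same across providers.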
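And since the standard Elo update rule is short enough to fit in a post, here is a minimal sketch of turning head-to-head A/B preference votes into ratings. The K-factor, starting rating of 1000, and the vote sequence are illustrative assumptions.

```python
# Minimal Elo sketch for ranking two models from blind A/B preference votes.
# K-factor, starting ratings, and the vote sequence are illustrative assumptions.

def expected_score(r_a: float, r_b: float) -> float:
    """Elo-predicted probability that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Update both ratings after one vote (score_a: 1.0 A wins, 0.0 B wins, 0.5 tie)."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * (e_a - score_a)

ratings = {"model_a": 1000.0, "model_b": 1000.0}
for vote in [1.0, 1.0, 0.0, 1.0, 0.5]:  # hypothetical evaluator votes
    ratings["model_a"], ratings["model_b"] = update(
        ratings["model_a"], ratings["model_b"], vote
    )
print(ratings)  # the model preferred more often ends with the higher rating
```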

2024 is the year of multimodal AI. 💬 🖼️ 🎥 🎤 AI systems are unlocking new applications and seeing improved performance by combining data types like text, images, video, and audio. In our latest blog post, learn about multimodal AI techniques, business use cases, and why multimodal AI is poised to revolutionize the way we interact with technology: https://bit.ly/4bXKy0f

Let’s talk LLM experimentation. 🧑‍🔬

One day, there may be a principled, scientific, and repeatable way to pick the right LLM and the right tools for any job. Until then, a degree of flexibility and ad-hoc artistry is necessary to decide which patchwork of features best serves an application’s needs. To keep experimenting and get the most value out of LLMs, it’s important to stay up to date on the latest tools and techniques. In this comprehensive guide, we highlight a number of projects in three categories:

🤳 Touchpoints: Quick, minimal LLM experimentation interfaces
⚖️ Evaluation: Metrics and relevant benchmark datasets
🪄 Enhancing Prompts: RAG, APIs, and well-chosen examples for your LLM to see how it’s done (a tiny few-shot sketch follows below)

Check it out: https://bit.ly/4e5PPEr
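On the "well-chosen examples" point, here is a tiny sketch of few-shot prompt construction; the sentiment task, the example pairs, and the build_prompt helper are all hypothetical, invented purely for illustration.

```python
# Hypothetical few-shot prompt builder: prepend worked examples so the LLM
# "sees how it's done" before answering the real query. The task and the
# examples are invented for illustration.
EXAMPLES = [
    ("The plot dragged and the dialogue fell flat.", "negative"),
    ("An instant classic with a stellar cast!", "positive"),
]

def build_prompt(query: str) -> str:
    shots = "\n\n".join(f"Review: {text}\nSentiment: {label}" for text, label in EXAMPLES)
    return f"{shots}\n\nReview: {query}\nSentiment:"

print(build_prompt("Great pacing, and the soundtrack carries every scene."))
```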

