        Red Hat AI Inference Server packages the leading innovation of vLLM into an enterprise-grade offering. It is available as a
        standalone containerized product or as part of both RHEL AI and Red Hat OpenShift AI.
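
        At its core the offering runs the vLLM inference engine, which can also be exercised directly from Python. The following is a
        minimal sketch under stated assumptions, not part of the announcement: the model id is a hypothetical entry in the Red Hat AI
        organization on Hugging Face, and the prompt and sampling settings are illustrative.

        # Minimal sketch of driving the vLLM engine that Red Hat AI Inference Server packages.
        # The model id below is a hypothetical example, not taken from the announcement.
        from vllm import LLM, SamplingParams

        llm = LLM(model="RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16")  # hypothetical repo id
        params = SamplingParams(temperature=0.7, max_tokens=128)
        outputs = llm.generate(["Summarize hybrid cloud inference in one sentence."], params)
        print(outputs[0].outputs[0].text)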


        Across any deployment environment, Red Hat AI Inference Server provides users with a hardened, supported distribution of vLLM, along
        with:


          - Intelligent LLM compression tools for dramatically reducing the size of both foundational and fine-tuned AI models, minimizing
            compute consumption while preserving and potentially enhancing model accuracy.
          - An optimized model repository, hosted in the Red Hat AI organization on Hugging Face, offering instant access to a validated and
            optimized collection of leading AI models ready for inference deployment and helping to improve inference efficiency by 2-4x
            without compromising model accuracy (a usage sketch follows this list).
          - Red Hat’s enterprise support and decades of expertise in bringing community projects to production environments.
          - Third-party support for even greater deployment flexibility, enabling Red Hat AI Inference Server to be deployed on non-Red Hat
            Linux and Kubernetes platforms pursuant to Red Hat’s third-party support policy.
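
        As referenced above, once a deployment is running it exposes vLLM’s OpenAI-compatible API, so the optimized models from the
        repository can be queried with standard clients. A minimal sketch, assuming a local deployment on port 8000 and a hypothetical
        optimized model id; both are assumptions for illustration, not details from the announcement.

        # Querying a running Red Hat AI Inference Server instance through the
        # OpenAI-compatible API that vLLM exposes. Endpoint and model id are
        # assumed for illustration; adjust them to your own deployment.
        from openai import OpenAI

        client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed local endpoint
        response = client.chat.completions.create(
            model="RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16",  # hypothetical optimized model
            messages=[{"role": "user", "content": "What is AI inference?"}],
        )
        print(response.choices[0].message.content)

        Because the API is OpenAI-compatible, existing client code can typically be pointed at the server by changing only the base URL.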

        Red Hat’s vision: Any model, any accelerator, any cloud.


        The future of AI must be defined by limitless opportunity, not constrained by infrastructure silos. Red Hat sees a horizon where
        organizations can deploy any model, on any accelerator, across any cloud, delivering an exceptional, more consistent user experience
        without exorbitant costs. To unlock the true potential of gen AI investments, enterprises require a universal inference platform – a
        standard for more seamless, high-performance AI innovation, both today and in the years to come.

        Just as Red Hat pioneered the open enterprise by transforming Linux into the bedrock of modern IT, the company is now poised to
        architect the future of AI inference. vLLM has the potential to become the linchpin of standardized gen AI inference, and Red Hat is
        committed to building a thriving ecosystem not just around the vLLM community but also around llm-d for distributed inference at
        scale. The vision is clear: regardless of the AI model, the underlying accelerator or the deployment environment, Red Hat intends to
        make vLLM the definitive open standard for inference across the new hybrid cloud.

        Red Hat Summit:


        Join the Red Hat Summit keynotes to hear the latest from Red Hat executives, customers and partners:

          Modernized infrastructure meets enterprise-ready AI — Tuesday, May 20, 8-10 a.m. EDT (YouTube)

          Hybrid cloud evolves to deliver enterprise innovation — Wednesday, May 21, 8-9:30 a.m. EDT (YouTube)

        Supporting Quotes:


        Joe Fernandes, vice president and general manager, AI Business Unit, Red Hat

        “Inference is where the real promise of gen AI is delivered, where user interactions are met with fast, accurate responses delivered by a
        given model, but it must be delivered in an effective and cost-efficient way. Red Hat AI Inference Server is intended to meet the demand
        for high-performing, responsive inference at scale while keeping resource demands low, providing a common inference layer that
        supports any model, running on any accelerator in any environment.”

        Ramine Roane, corporate vice president, AI Product Management, AMD


        “In collaboration with Red Hat, AMD delivers out-of-the-box solutions to drive efficient generative AI in the enterprise. Red Hat AI
        Inference Server enabled on AMD Instinct™ GPUs equips organizations with enterprise-grade, community-driven AI inference capabilities
        backed by fully validated hardware accelerators.”


        Jeremy Foster, senior vice president and general manager, Cisco