Recent Updates
  • WWW.AIWIRE.NET
    Inside SAS's Push to Make AI Agents Accountable
At SAS Innovate 2025 in Orlando, SAS unveiled its roadmap for agentic AI, making the case for its role as a company that has been quietly working on intelligent decision automation long before AI agents became a trending topic. The latest enhancements to its SAS Viya platform aim to help enterprises design, deploy, and govern AI agents that combine automation with ethical oversight.

While many tech vendors are racing to show off how many AI agents they can spin up at once, SAS CTO Bryan Harris dismisses such counts as a vanity metric. What really counts, he said, is not the quantity of agents but the quality of their output. "The metric that matters," Harris told AIwire, "is what kind of decisions you're running in the enterprise, and what's the value of those decisions to the business?"

How SAS Defines Agentic AI

Agentic AI, as defined by SAS, is not simply about automating tasks; it is about building systems that make decisions with a blend of reasoning, analytics, and embedded governance. The SAS Viya platform supports this vision by integrating deterministic models, machine learning algorithms, and large language models into a unified orchestration layer. The goal is to enable enterprises to deploy intelligent agents that are capable of acting autonomously when appropriate but also provide transparency and human oversight when the stakes are high.

SAS Innovate 2025. (Source: The Author)

Udo Sglavo, VP of applied AI and modeling R&D, described SAS's agentic push as a natural evolution from the company's consulting-driven past. "We've been doing this kind of modeling exercise for a long time, but typically it was a one-to-one relationship. You came to me with a problem, I'd send in consultants, they'd solve it, off we go," Sglavo told AIwire. "Now the idea is, if you've done this ten, a hundred times for the same kind of challenge, why not take all this IP and put it into a software product?"

This shift from services to scalable solutions, according to Sglavo, has been accelerated by growing comfort with LLMs. "There's been a mindset change. Customers are now more willing to adopt models they didn't build themselves," he said. That shift has cleared the way for wider adoption of prepackaged models and agent-based systems.

The Limits of Large Language Models

Both Harris and Sglavo emphasized that LLMs, despite their widespread appeal, are only one piece of a much larger enterprise AI picture. At SAS, LLMs are viewed as valuable but limited components that need to be paired with other forms of intelligence to drive reliable, repeatable decisions.

The SAS executives explained that unlike deterministic models, which return consistent outputs for the same inputs every time, LLMs can be unpredictable. "If I run a deterministic model with the same conditions a thousand times, I'll get the same answer a thousand times," Harris said. "That's not the case for large language models." This variability makes them ill-suited for high-stakes applications where auditability and control are critical. Instead, SAS uses LLMs where they excel: speeding up repetitive tasks and generating prototype solutions that humans or more deterministic systems can later refine.

One example of repetitive task speedup is in schema mapping, a task that often requires domain knowledge and painstaking manual review. With metadata as input, LLMs can rapidly suggest column matches and generate code, reducing a multi-week effort to minutes. However, because accuracy can vary, SAS integrates confidence scoring and always includes a human-in-the-loop to validate results.
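SAS has not published the implementation details of this workflow; the sketch below only illustrates the general confidence-gated pattern the executives describe. The LLM call is stubbed out with trivial name matching, and the threshold is illustrative.

```python
# Minimal sketch of a confidence-gated schema-mapping workflow, assuming a
# stand-in for the LLM call. Not SAS's implementation.

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff; would be tuned per use case

def suggest_mappings(source_columns, target_columns):
    """Stand-in for an LLM that proposes source->target column matches.

    Returns (source, target, confidence) tuples; here, fake name matching
    stands in for model-reported confidence so the sketch runs end to end.
    """
    suggestions = []
    for src in source_columns:
        for tgt in target_columns:
            s, t = src.lower().replace("_", ""), tgt.lower().replace("_", "")
            if s == t:
                suggestions.append((src, tgt, 0.95))
            elif s in t or t in s:
                suggestions.append((src, tgt, 0.60))
    return suggestions

def triage(suggestions):
    """Split suggestions into auto-accepted mappings and a human review queue."""
    accepted, needs_review = [], []
    for src, tgt, conf in suggestions:
        (accepted if conf >= CONFIDENCE_THRESHOLD else needs_review).append(
            (src, tgt, conf)
        )
    return accepted, needs_review

accepted, needs_review = triage(
    suggest_mappings(["cust_id", "addr"], ["CustID", "AddressLine"])
)
print("auto-accepted:", accepted)
print("queued for human validation:", needs_review)
```

The design point is the split itself: high-confidence matches flow through automatically, while everything else lands in front of a person rather than silently entering production.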
In more advanced use cases, SAS has also implemented techniques that allow LLMs to iterate on their own outputs by revisiting earlier steps, rethinking mappings, and challenging initial assumptions. This iterative self-checking behavior is a key design principle in SAS's agentic AI framework, where agents do not just accept the first answer but reason through problems dynamically.

Giving Agents a Goal

The key distinction SAS draws between traditional automation and agentic AI lies in goal orientation. Rather than simply executing a set of predefined instructions, agents are designed to pursue a defined goal and adjust their behavior dynamically until that goal is met. This capability reflects a shift in how organizations are thinking about AI, driven in part by the disillusionment that followed early enthusiasm around LLMs.

Udo Sglavo, SAS VP of Applied AI and Modeling R&D

Sglavo explained in an interview how many business leaders initially hoped that generative models would offer a kind of universal intelligence where you could drop in a business problem and get out a solution. Instead, LLMs proved best suited for narrow tasks like text analysis. The emergence of agentic AI, he said, represents an effort to combine the statistical, machine learning, and optimization techniques developed over decades with the newer capabilities of LLMs and retrieval-augmented knowledge systems.

In this framework, agents become orchestrators of those tools. Rather than being explicitly programmed for each step, they are handed an objective, such as increasing event registration numbers, and are then tasked with deciding how to achieve it. For example, an agent could generate emails, identify potential recipients using a statistical model, and continue refining its campaign until a defined target is reached.

This kind of agent, Sglavo noted, is well-suited for low-risk scenarios like marketing campaigns. But when the stakes are higher, such as decisions about credit approvals or healthcare outcomes, the approach must shift. Human-in-the-loop oversight becomes essential, and clear governance frameworks must define where autonomy ends and accountability begins.

Governance and Trust at the Core

The SAS executives stressed that agentic AI cannot be responsibly deployed without built-in governance. SAS Viya includes mechanisms to detect bias, evaluate fairness, and provide full transparency into how decisions are made. "We give our customers insight into when a model is deficient," said Harris. "And then they can make the choice to improve the data or improve the model."

(Source: Suri_Studio/Shutterstock)

Governance also includes controls over how much autonomy agents are granted. This is especially critical in high-risk domains like finance, healthcare, and public services. SAS includes guardrails that ensure transparency and lets customers fine-tune how much autonomy agents are allowed.

SAS also emphasizes the importance of localized knowledge sources. Rather than relying on internet-sourced information, agents can be configured to draw only from enterprise-specific data repositories. Retrieval-augmented generation (RAG) setups enable agents to access internal knowledge bases to make contextual decisions without compromising security or accuracy.
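The article does not detail SAS's RAG implementation; the following is a minimal, generic sketch of the pattern, with a toy keyword-overlap retriever standing in for a real embedding search over an enterprise document store. The documents and prompt wording are illustrative.

```python
# Generic sketch of the RAG pattern described above; not SAS's implementation.
# Documents live in an enterprise-controlled store; retrieval here is a toy
# word-overlap scorer standing in for a real vector search.

INTERNAL_DOCS = {
    "policy-001": "Refunds over $500 require manager approval.",
    "policy-002": "Customer data must not leave the EU region.",
}

def retrieve(query, k=1):
    """Rank internal documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        INTERNAL_DOCS.items(),
        key=lambda kv: len(q_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query):
    """Ground the model's answer in retrieved internal text only."""
    context = "\n".join(text for _, text in retrieve(query))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

print(build_prompt("When does a refund require manager approval?"))
```

Because retrieval is restricted to the internal store, the agent's answers stay grounded in enterprise data rather than whatever the base model absorbed from the open internet.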
A Marketplace of Agents Is Coming

Looking ahead, Sglavo expects agentic AI to evolve into an open marketplace where enterprises can mix and match specialized agents from different vendors. In that future, decision-making will be distributed across interconnected agent networks that communicate and collaborate using shared protocols like MCP or Google's open source A2A. This vision also redefines how enterprises think about deployment: rather than shipping massive, monolithic AI systems, companies will deploy nimble agents, each with a narrow focus but deep specialization.

"This will become the marketplace of agents," Sglavo said. "Because while we may say we have the best supply chain optimization agent, another vendor may claim the same thing. And then it becomes a question of trust, pricing, track record. Have they done this before? Are they just a startup that's good at tech but hasn't worked with actual customers?"

Sglavo added that enterprises will want the flexibility to select and combine agents based on their needs. "You'll say, I want to use this agent, this one, and this one, and just bring them all together."

A Future Built on Accountable AI

Bryan Harris, CTO at SAS

As generative AI continues to capture headlines, SAS is placing its bet on decision-first AI. For companies in regulated sectors where the cost of a bad decision can be measured in lives or billions, the company argues, transparency and trust must come before experimentation or scale.

As the enterprise AI conversation shifts from experimental prototypes to more practical, accountable systems, SAS is staking out a space where trust, interoperability, and decision quality come first. "You can't prevent irresponsibility," said Harris. "But we can give you the tools that allow you to make the right decision."
  • WWW.AIWIRE.NET
    Colossus AI Hits 200,000 GPUs as Musk Ramps Up AI Ambitions
Elon Musk's Colossus AI infrastructure, said to be one of the most powerful AI computing clusters in the world, has just reached full operational capacity. Designed to push the boundaries of AI, this massive computing system now consists of 200,000 GPUs, backed by Tesla Megapack batteries. This is a significant milestone in Musk's growing push into AI.

With the on-site substation going online and connecting to the main power grid, phase 1 of the Colossus AI infrastructure, located in Memphis, TN, is now complete. The supercomputer is now running at 150 MW from the grid, according to the Greater Memphis Chamber. The additional 150-megawatt Megapack battery system will act as a backup power source, ensuring continued operation during outages or periods of heightened electricity demand.

Colossus AI is the flagship product of Musk's AI company, xAI. The supercomputer was first activated in July last year with 100,000 Nvidia GPUs, after being built at an astonishing pace: the entire project was completed in 122 days, and the stretch from hardware installation to the training phase took only 19 days. The pace of the project impressed Nvidia CEO Jensen Huang, who pointed out that projects of this scale typically take around four years.

"As far as I know, there's only one person in the world who could do that," said Huang. "Elon is singular in his understanding of engineering and construction and large systems and marshaling resources; it's just unbelievable."

However, the speed came at a cost, as the facility initially lacked a direct connection to the power grid. To keep operations running, the site depended on natural gas turbine generators for electricity, raising concerns about emissions and sustainability. Early reports suggested 14 turbines were supplying power, each generating 2.5 MW, but observations from residents indicated the number in the surrounding area may have exceeded 35. That is more than twice the permitted limit. This reliance on temporary power sources had sparked discussions about the long-term energy plan for the facility, especially as xAI looks to scale up operations further.

(Source: sdx15/Shutterstock)

Adding more GPUs to the infrastructure means that the AI cluster can now rely more on grid power rather than gas-powered generators. This will help improve efficiency and address environmental concerns. Reportedly, xAI plans to remove half the temporary generators by the end of the summer. The other half will have to remain to deliver the electrical needs of the second phase of the Memphis Supercluster.

Musk plans to double the capacity of Colossus AI before the end of this year. Another 150 MW is going to be added, taking the total capacity to 300 MW. This translates to powering roughly 300,000 homes.
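That homes figure implies an average continuous household draw of about 1 kW, a common rule of thumb for US homes; the quick check below makes the assumption explicit (the per-home number is ours, not from xAI or the utility).

```python
# Back-of-the-envelope check of the "300 MW powers 300,000 homes" claim.
# Assumes ~1 kW average continuous draw per US home; actual figures vary
# by region and season.

total_capacity_mw = 300   # planned Colossus capacity
avg_home_draw_kw = 1.0    # assumed average household load

homes_powered = total_capacity_mw * 1000 / avg_home_draw_kw
print(f"{homes_powered:,.0f} homes")  # -> 300,000 homes
```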
It's not surprising that this massive power demand has sparked concerns about whether the Tennessee Valley Authority (TVA) has sufficient capacity to support it. xAI has publicly stated plans to expand its Colossus supercomputer to over 1 million GPUs.

For the local economy, Colossus AI promises economic development and infrastructure investment. However, concerns persist regarding disruptions to power quality for residents and the project's environmental impact.

"You don't become the moniker for technological innovation because someone comes in and exploits your natural resources, your water, exploits the loopholes that allow them to pollute the air," said KeShaun Pearson, the director of the grassroots organization Memphis Community Against Pollution (MCAP). "That's not what makes you a technological city. That spin is dangerous because it opens our city up for exploitation even further."

The road to powering a million GPUs started when Musk founded xAI in July 2023, with the stated goal of understanding the true nature of the universe. In more practical terms, Musk wanted an AI lab under his own direction, free from the influences of Microsoft, Google, or other major tech firms. The company is an answer to the growing dominance of OpenAI (which now has Microsoft as a close partner) and Google's DeepMind. xAI is also integrated with Musk's other ventures, including SpaceX and Tesla. With Colossus now operating at full capacity, xAI is positioned to accelerate the development and deployment of AI across Musk's broader ecosystem.

This article first appeared on BigDATAwire.
  • WWW.AIWIRE.NET
    Redefining AI: From Suspect to Solution in Building a Sustainable Future
Initially met with doubt and even apprehension, artificial intelligence (AI) has struggled to shed its reputation as a potential threat. Now, as generative AI continues to make significant advancements, the technology that once seemed distant is rapidly becoming a reality. Yet the image of AI as a threat persists.

This negative portrayal influences public opinion, shaping perceptions of AI as an unseen force threatening our livelihoods or as a resource-hungry entity causing environmental harm in data centers. However, AI is not a monstrous entity to fear; it is a transformative technology with the potential to drive progress, particularly in the field of energy.

So, why is AI struggling to shake off its negative reputation? What does the future look like as we move into a new era of energy? And what will AI's role be in the electrification and digitalization of the world?

The Looming Figure of AI

Let's start by addressing some common concerns raised in the media. While AI offers immense benefits, much of the trepidation stems from a lack of understanding. A 2024 Ipsos report found that while half of respondents feel nervous about AI, only the same proportion actually know what products and services rely on it.

Widely reported concerns about AI center on energy consumption, carbon footprint, and operational costs. Looking at ChatGPT, for example, studies estimate that the large language model consumes around 2.9 watt-hours per search, almost ten times the energy needed for a standard Google search. Other stories detail AI's water consumption, with researchers at the University of California, Riverside reporting that a 100-word email generated by an AI chatbot consumes more than 500 ml of water.

(Source: Shutterstock)

While AI may require more resources than other technologies, innovation is making it increasingly energy efficient. Researchers are currently developing specialized hardware, such as 3D chips, which significantly enhance performance while reducing energy consumption. For instance, Nvidia, a leading chip manufacturer, claims its GB200 "superchip" delivers a 30-fold increase in performance for generative AI while using 25 times less energy.

More importantly, AI plays a crucial role in sustainability by actually helping reduce energy and water consumption. When integrated into energy management software, AI enables businesses to identify inefficiencies, optimize resource use, and lower energy costs, emissions, and overall consumption. This not only improves ESG reporting scores but also supports national and global sustainability goals.

AI is essential for the future, and with continued advancements, it does not have to come at the expense of environmental responsibility.

The Age of Electricity 4.0

As we look to the road ahead and our ambitious climate targets, energy is the key area that requires transformation. With 80% of global carbon emissions coming from the production and consumption of energy, decarbonizing energy is the key to net zero. Luckily, by using existing technologies to decarbonize, we could cut CO2 emissions by 70%, saving 10-15 Gt of CO2 annually; this is where we move into the era of Electricity 4.0.

This era is characterized by wide-scale electrification and digitalization of energy and infrastructure, turning energy from the biggest driver of carbon emissions into the biggest opportunity for carbon reduction. Electrification makes energy more sustainable, moving away from fossil fuels in favor of an increasing share of renewables.
Digitalization means making energy data more visible and integrating more energy automation, allowing leaders to boost efficiency and make substantial consumption and cost savings. The two together will be crucial to power a more sustainable and resilient world. Going one step further, to make Electricity 4.0 as effective as it can be, AI will prove itself a key tool to turbocharge this cleaner, more efficient energy future.

AI Dials Up the Positives

(Source: Shutterstock)

We must reframe AI as a key enabler of a more electric, digital, and sustainable future. As a catalyst for Electricity 4.0, AI enhances smarter, faster, and more precise decision-making through real-time monitoring and big data analysis. This enables businesses to better manage on-site energy storage, smooth peak consumption, and reduce reliance on fossil fuels in ways previously unattainable. Our North American R&D hub in Boston is a prime example, featuring an advanced microgrid with 1,379 solar modules and photovoltaic inverters for on-site power generation. By leveraging AI and cloud-based analytics through EcoStruxure Microgrid Advisor, the hub optimizes energy performance across solar, energy storage, and EV charging, generating over 520,000 kWh annually, equivalent to removing the annual greenhouse gas emissions of more than 2,400 cars.

Additionally, AI-driven solutions can optimize energy use in residential spaces. At the basic level, this allows users to dynamically adjust lighting and heating based on occupancy patterns, significantly reducing energy waste. This technology can also power the future of the "prosumer," whereby users generate, store, and manage their own renewable energy. This transformation streamlines operations, cuts costs, and accelerates sustainability efforts.

Powering the Future With AI

By optimizing energy systems, improving efficiency, and driving innovation across industries, AI will play a central role in reducing emissions and advancing renewable energy solutions. Far from being a villain, AI's potential to revolutionize climate action makes it one of the most powerful allies we have in the fight against climate change.

A more sustainable future, one rooted in clean energy and responsible consumption, is not just possible, but undeniably AI-powered.

About the Author

Frédéric Godemel is the Executive Vice-President for Energy Management and a member of Schneider Electric's Executive Committee, effective January 2025. Prior to this role, Frédéric was the Executive Vice President for Power Systems and Services, acting as a strong advocate for electrification and decarbonization, often representing Schneider Electric at high-profile speaking engagements. He joined Schneider in 1990 and developed his international career around the power domain, spanning low and medium voltage, energy automation, infrastructure, and services. Over the years, he has held various global and operational leadership roles based in China, the UAE, and France. Frédéric holds a degree in engineering from Ecole Centrale de Nantes (France) and an MBA from ESSEC (France).
  • WWW.AIWIRE.NET
    Claude's Moral Map: Anthropic Tests AI Alignment in the Wild
Claude, the AI chatbot developed by Anthropic, might be more than just helpful: it may have a sense of right and wrong. A new study analyzing over 300,000 user interactions reveals that Claude expresses a surprisingly coherent set of human-like values. The company released its new AI alignment research in a preprint paper titled "Values in the wild: Discovering and analyzing values in real-world language model interactions."

Anthropic has trained Claude to be helpful, honest, and harmless using techniques like Constitutional AI, but this study marks the company's first large-scale attempt to test whether those values hold up under real-world pressure.

The company says it began the research with a sample of 700,000 anonymized conversations that users had on Claude.ai Free and Pro during one week of February 2025 (the majority of which were with Claude 3.5 Sonnet). It then filtered out conversations that were purely factual or unlikely to include dialogue concerning values, restricting the analysis to subjective conversations. This left 308,210 conversations for analysis.

Claude's responses reflected a wide range of human-like values, which Anthropic grouped into five top-level categories: Practical, Epistemic, Social, Protective, and Personal. The most commonly expressed values included professionalism, clarity, and transparency. These values were further broken down into subcategories like critical thinking and technical excellence, offering a detailed look at how Claude prioritizes behavior across different contexts.

Anthropic says Claude generally lived up to its helpful, honest, and harmless ideals: "These initial results show that Claude is broadly living up to our prosocial aspirations, expressing values like user enablement (for helpful), epistemic humility (for honest), and patient wellbeing (for harmless)," the company said in a blog post.

Claude also showed it can express values opposite to what it was trained for, including dominance and amorality. Anthropic says these deviations were likely due to jailbreaks, or conversations that bypass the model's behavioral guidelines. This might sound concerning, but in fact it represents an opportunity: "Our methods could potentially be used to spot when these jailbreaks are occurring and thus help to patch them," the company said.

One fascinating insight gleaned from this study is that Claude's values are not static and can shift depending on the situation, much like a human's set of values might. When users ask for romantic advice, Claude tends to emphasize healthy boundaries and mutual respect. In contrast, when analyzing controversial historical events, it leans on historical accuracy.

Anthropic's overall approach: using language models to extract AI values and other features from real-world (but anonymized) conversations, then taxonomizing and analyzing them to show how values manifest in different contexts. (Source: Anthropic)

Anthropic also found that Claude frequently mirrors users' values: "We found that, when a user expresses certain values, the model is disproportionately likely to mirror those values: for example, repeating back the values of authenticity when this is brought up by the user," the company said. In more than a quarter of conversations (28.2%), Claude strongly reinforced the user's own expressed values. Sometimes this mirroring makes the assistant seem empathetic, but at other times it edges into what Anthropic calls "pure sycophancy," and the researchers note that these results leave questions about which is which.
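Anthropic describes its extraction pipeline only at a high level; the toy sketch below illustrates the shape of such a mirroring measurement. A keyword matcher stands in for the LLM-based value extractor the paper describes, and the conversations and value lexicon are hypothetical.

```python
# Toy sketch of the mirroring analysis described above; not Anthropic's code.
# extract_values() stands in for an LLM value extractor, here keyword-based
# so the example runs.

VALUE_KEYWORDS = {
    "authenticity": {"authentic", "genuine"},
    "transparency": {"transparent", "openly"},
}

def extract_values(text):
    """Return the set of values a piece of text appears to express."""
    words = set(text.lower().split())
    return {v for v, kws in VALUE_KEYWORDS.items() if words & kws}

def mirroring_rate(conversations):
    """Fraction of conversations where the assistant echoes a user value."""
    mirrored = sum(
        1 for user_msg, assistant_msg in conversations
        if extract_values(user_msg) & extract_values(assistant_msg)
    )
    return mirrored / len(conversations)

convos = [
    ("I value being authentic with people.", "Being genuine matters here."),
    ("Just give me the numbers.", "Here are the figures you asked for."),
]
print(f"mirroring rate: {mirroring_rate(convos):.0%}")  # -> 50%
```

At scale, the same counting logic over hundreds of thousands of conversations is what yields a figure like the 28.2% reported in the study.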
Notably, Claude does not always go along with the user. In a small number of cases (3%), the model pushed back, typically when users asked for unethical content or shared morally questionable beliefs. This resistance, researchers suggest, might reflect Claude's most deeply ingrained values, surfacing only when the model is forced to take a stand.

These kinds of contextual shifts would be hard to capture through traditional, static testing. But by analyzing Claude's behavior in the wild, Anthropic was able to observe how the model prioritizes different values in response to real human input, revealing not just what Claude believes but when and why those values emerge.

(Source: Nadia Snopek/Shutterstock)

As AI systems like Claude become more integrated into daily life, it is increasingly important to understand how they make decisions and which values guide those decisions. Anthropic's study offers not only a snapshot of Claude's behavior but also a new method for tracking AI values at scale. The team has also made the study's dataset publicly available for others to explore.

Anthropic notes that its approach comes with limitations. Determining what counts as a "value" is subjective, and some responses may have been oversimplified or placed into categories that do not quite fit. Because Claude was also used to help classify the data, there may be some bias toward finding values that align with its own training. The method also cannot be used before a model is deployed, since it depends on large volumes of real-world conversations.

Still, that may be what makes it useful. By focusing on how an AI behaves in actual use, this approach could help identify issues that might not otherwise surface during pre-deployment evaluations, including subtle jailbreaks or shifting behavior over time. As AI becomes a more regular part of how people seek advice, support, or information, this kind of transparency could be a valuable check on how well models are living up to their goals.
  • WWW.AIWIRE.NET
    Lenovo Storage Portfolio Refresh Aims to Speed Up AI Inference
So far, 2025 has been the year of agentic AI and real-time LLM deployment. But another piece of the AI stack is coming into sharper focus: storage.

As enterprises move from experimentation to real-world deployment, they are rethinking how infrastructure supports inference at scale. Tasks like feeding large language models with high-speed data, running retrieval-augmented generation (RAG) workflows, and managing hybrid cloud environments all depend on fast, efficient, and scalable storage systems. Real-time inference can strain bandwidth, increase latency, and reveal the limits of legacy infrastructure. Lenovo is responding with what it calls the largest storage portfolio refresh in its history, aimed at improving data throughput, reducing power demands, and simplifying deployment across hybrid environments.

Among the key additions to Lenovo's portfolio are new AI Starter Kits that combine compute, storage, and networking in pre-validated configurations for RAG and inferencing workloads. These kits include features like autonomous ransomware protection, encryption, and failover capabilities, with an emphasis on reducing integration complexity for IT teams.

The company is also introducing what it describes as the industry's first liquid-cooled hyperconverged infrastructure appliance. This "GPT-in-a-box" system, part of the ThinkAgile HX series, uses Lenovo Neptune liquid cooling to support high-density inference workloads while reducing energy consumption by up to 25 percent compared to previous-generation systems.

Lenovo says its new ThinkSystem Storage Arrays offer performance gains of up to three times over the previous generation, along with power and density improvements that aim to shrink datacenter footprints. The company claims these systems can deliver up to 97 percent energy savings and 99 percent greater storage density when replacing legacy hard drive-based systems.

Other updates include the ThinkAgile SDI V4 Series, which uses a software-defined approach to combine compute and storage resources for containerized and virtualized AI workloads. Lenovo claims up to 2.4 times faster inference performance for large language models, as well as gains in IOPS and transaction rates.

Scott Tease, VP and general manager of Lenovo's Infrastructure Solutions Product Group, said the new storage offerings are aimed at helping businesses scale AI more effectively: "The new Lenovo Data Storage Solutions help businesses harness AI's transformative power with a data-driven strategy that ensures scalability, interoperability, and tangible business outcomes powered by trusted infrastructure. The new solutions help customers achieve faster time to value no matter where they are on their IT modernization journey with turnkey AI solutions that mitigate risk and simplify deployment."

One of the early adopters of Lenovo's new storage offerings is OneNet, a provider of private cloud services. The company is using Lenovo's infrastructure to improve both performance and energy efficiency in its datacenters.

"Innovation is embedded in OneNet's DNA, and partnering with Lenovo represents a commitment to modernizing the data center with cutting-edge solutions that drive efficiency and sustainability," said Tony Weston, CTO at OneNet.
"Backed by Lenovo solutions and Lenovo Premier Support, OneNet can deliver high-availability, high-performance private cloud services that our customers can depend on."

With this portfolio update, Lenovo is positioning itself as a key infrastructure provider for enterprises looking to scale AI workloads without overhauling their entire stack. As inferencing and retrieval-based models become standard in production environments, vendors across the ecosystem are under pressure to make storage smarter, faster, and more adaptable.
  • WWW.AIWIRE.NET
    It's Time to Get Comfortable with Uncertainty in AI Model Training
It's obvious when a dog has been poorly trained. It doesn't respond properly to commands, pushes boundaries, and behaves unpredictably. The same is true of a poorly trained artificial intelligence (AI) model. Only with AI, it's not always easy to identify what went wrong with the training.

Research scientists globally are working with a variety of AI models trained on experimental and theoretical data. The goal is to predict a material's properties before creating and testing it. They are using AI to design better medicines and industrial chemicals in a fraction of the time it takes for experimental trial and error.

But how can they trust the answers that AI models provide? It's not just an academic question. Millions of investment dollars can ride on whether AI model predictions are reliable.

A research team from the Department of Energy's Pacific Northwest National Laboratory has developed a method to determine how well a class of AI models called neural network potentials has been trained. Further, it can identify when a prediction is outside the boundaries of its training and where it needs more training to improve, a process called active learning.

The research team, led by PNNL data scientists Jenna Bilbrey Pope and Sutanay Choudhury, describes how the new uncertainty quantification method works in a research article published in npj Computational Materials. The team is also making the method publicly available on GitHub as part of its larger repository, Scalable Neural Network Atomic Potentials (SNAP), to anyone who wants to apply it to their own work.

A dog that has been poorly trained is like an AI model that has been poorly trained. It doesn't know its boundaries. (Source: Jaromir Chalabala/Shutterstock) [Editor's Note: But he's a good boy.]

"We noticed that some uncertainty models tend to be overconfident, even when the actual error in prediction is high," said Bilbrey Pope. "This is common for most deep neural networks. However, a model trained with SNAP gives a metric that mitigates this overconfidence. Ideally, you'd want to look at both prediction uncertainty and training data uncertainty to assess your overall model performance."

Instilling trust in AI model training to speed discovery

Research scientists want to take advantage of AI's speed of predictions, but right now there's a tradeoff between speed and accuracy. An AI model can make predictions in seconds that might take a supercomputer 12 hours to compute using traditional computationally intensive methods. However, chemists and materials scientists still see AI as a black box. The PNNL data science team's uncertainty measurement provides a way to understand how much they should trust an AI prediction.

"AI should be able to accurately detect its knowledge boundaries," said Choudhury. "We want our AI models to come with a confidence guarantee. We want to be able to make statements such as 'This prediction provides 85% confidence that catalyst A is better than catalyst B, based on your requirements.'"

In their published study, the researchers chose to benchmark their uncertainty method against one of the most advanced foundation models for atomistic materials chemistry, called MACE. The researchers calculated how well the model is trained to calculate the energy of specific families of materials. These calculations are important to understanding how well the AI model can approximate the more time- and energy-intensive methods that run on supercomputers. The results show what kinds of simulations can be calculated with confidence that the answers are accurate.
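The article does not reproduce the paper's specific SNAP metric. A common baseline for uncertainty in neural network potentials, though, is ensemble disagreement: train several models, use the spread of their energy predictions as the uncertainty estimate, and flag high-spread structures for active learning. A minimal sketch under that assumption, with stub models and an illustrative threshold:

```python
# Ensemble-based uncertainty for a neural network potential: a common
# baseline, not the specific SNAP metric from the PNNL paper. Each "model"
# below is a stub standing in for a trained potential.

import statistics

UNCERTAINTY_THRESHOLD = 0.05  # eV/atom; would be calibrated on held-out data

def ensemble_predict(models, structure):
    """Mean energy and spread across an ensemble of potentials."""
    energies = [model(structure) for model in models]
    return statistics.mean(energies), statistics.stdev(energies)

def needs_more_training(models, structure):
    """Active-learning trigger: flag structures the ensemble disagrees on."""
    _, spread = ensemble_predict(models, structure)
    return spread > UNCERTAINTY_THRESHOLD

# Stub ensemble: three "potentials" that disagree slightly on one structure.
ensemble = [lambda s: -3.10, lambda s: -3.12, lambda s: -3.11]
mean_e, spread = ensemble_predict(ensemble, "Cu-slab")
verdict = "retrain" if needs_more_training(ensemble, "Cu-slab") else "trust"
print(f"E = {mean_e:.3f} eV/atom, spread = {spread:.3f} -> {verdict}")
```

The point the PNNL quote makes is that such raw disagreement scores are often overconfident, which is why their method adds a calibrated, training-data-aware metric on top.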
This kind of trust and confidence in predictions is crucial to realizing the potential of incorporating AI workflows into everyday laboratory work and to the creation of autonomous laboratories where AI becomes a trusted lab assistant, the researchers added.

"We have worked to make it possible to wrap any neural network potentials for chemistry into our framework," said Choudhury. "Then in a SNAP, they suddenly have the power of being uncertainty aware."

Now, if only puppies could be trained in a snap.

In addition to Bilbrey Pope and Choudhury, PNNL data scientists Jesun S. Firoz and Mal-Soon Lee contributed to the study. This work was supported by the Transferring exascale computational chemistry to cloud computing environment and emerging hardware technologies (TEC4) project, which is funded by the DOE Office of Science, Office of Basic Energy Sciences.

About PNNL

Pacific Northwest National Laboratory draws on its distinguishing strengths in chemistry, Earth sciences, biology, and data science to advance scientific knowledge and address challenges in energy resiliency and national security. Founded in 1965, PNNL is operated by Battelle and supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the DOE Office of Science website. For more information on PNNL, visit PNNL's News Center. Follow us on Twitter, Facebook, LinkedIn, and Instagram.

Note: This article was initially posted on the PNNL News Site and is reproduced here with permission. Karyn Hede is a Senior Science Communicator and Media Relations Advisor at PNNL.
  • WWW.AIWIRE.NET
    Parallel Works Unveils ACTIVATE High Security Platform, Secures Key DoD Accreditation
Parallel Works has just launched its new ACTIVATE High Security Platform, a hybrid multi-cloud computing control plane designed to meet the Department of Defense's (DoD) stringent security needs. The platform, which has achieved Impact Level 5 (IL5) Provisional Authorization from the Defense Information Systems Agency (DISA), offers DoD agencies and contractors a faster, more affordable path to securely running critical HPC and AI workloads across public and private cloud environments.

The launch positions Parallel Works among a small group of providers authorized to handle Controlled Unclassified Information (CUI) and International Traffic in Arms Regulations (ITAR) data, a category that includes tech giants such as AWS, Google, and Oracle. It also marks a rare entry for a specialized control plane vendor focused specifically on hybrid HPC and AI workloads.

ACTIVATE HSP addresses a major bottleneck in the defense sector: the slow and costly security accreditation process that has prevented wider adoption of cloud computing. By providing a pre-accredited IL5 environment, the new platform will allow agencies and contractors to bypass months of security work and begin running sensitive workloads with minimal setup.

Meeting an Overdue Need in Defense Cloud Adoption

Defense organizations are turning to cloud computing to meet modernization goals such as scaling compute capacity quickly, accelerating workload deployment, and gaining faster access to high-performance AI and simulation tools. These priorities are central to initiatives like the Joint Warfighter Cloud Capability (JWCC) program. However, actually using cloud services for classified or sensitive projects has remained difficult.

(Source: Parallel Works)

In an interview, Parallel Works CEO Matthew Shaxted told AIwire that major cloud providers like AWS, Azure, Google, and Oracle "would probably agree that the DoD, and the defense industrial base in general, isn't leveraging cloud to its full potential. The accreditation barrier is impeding wide adoption. It forms a valley of death for using cloud in the DoD."

That challenge remains even after initial cloud accounts are set up. Even when cloud providers like AWS or Azure receive blanket security authorizations, end users still face a steep climb out of this valley of death. Setting up an account is relatively easy, but preparing it for real work requires configuring it to meet IL5 or similar standards, passing audits, and obtaining an Authority to Operate (ATO), which typically takes eighteen months or more, plus significant financial and labor investments.

Parallel Works designed ACTIVATE HSP specifically to eliminate these hurdles. Customers can operate within a fully accredited IL5 environment without taking on the overhead of building and certifying their own. The platform provides immediate access to IL5-compliant cloud infrastructure while allowing organizations to retain total control over their data, workloads, and systems.

What ACTIVATE HSP Does

Founded in 2015 and spun out of Argonne National Laboratory, Parallel Works developed its original ACTIVATE platform to simplify the management of complex hybrid computing environments.
Initially designed to streamline batch scheduler systems, ACTIVATE has evolved into a unified control plane that connects users to many different cloud and on-premises HPC and AI resources.

(Source: Parallel Works)

ACTIVATE HSP builds on this foundation, offering a high-security version that supports deployment across AWS GovCloud, with support for Azure Government and Google Cloud Platform expected in late summer. Users can burst workloads across multiple cloud providers while maintaining strict IL5 compliance. The platform also links to Defense Supercomputing Resource Centers (DSRCs), so users can combine local HPC systems with cloud-based resources.

Parallel Works also built real-time cost controls into ACTIVATE HSP. Instead of waiting for delayed billing reports from cloud providers, the platform monitors spending every three minutes and can enforce budget limits immediately. This helps prevent the budget overruns that have historically plagued government cloud projects due to billing cycle delays.
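Parallel Works has not published how this guard is implemented; the sketch below only shows the general shape of a polling budget guard with the three-minute interval from the article. The spend readings, limit, and enforcement action are hypothetical stand-ins for provider APIs.

```python
# Illustrative sketch of a polling budget guard like the one described above;
# not Parallel Works' implementation.

import itertools
import time

POLL_INTERVAL_S = 180        # three minutes, per the article
BUDGET_LIMIT_USD = 10_000.0  # example project budget

# Simulated spend readings; a real guard would query near-real-time cost APIs.
_readings = itertools.chain([9_200.0, 9_800.0, 10_050.0],
                            itertools.repeat(10_050.0))

def get_current_spend():
    """Stand-in for querying the provider's current project spend."""
    return next(_readings)

def suspend_workloads():
    """Stand-in for the enforcement action (pause queues, stop instances)."""
    print("budget exceeded: suspending new workloads")

def budget_guard(poll_interval=0):  # 0 for this demo; POLL_INTERVAL_S in production
    while True:
        spend = get_current_spend()
        if spend >= BUDGET_LIMIT_USD:
            suspend_workloads()
            break
        print(f"spend ${spend:,.2f} of ${BUDGET_LIMIT_USD:,.2f}: ok")
        time.sleep(poll_interval)

budget_guard()  # in practice this would run as a monitored background service
```

The short polling interval is the design choice that matters: enforcement can only be as prompt as the freshest spend reading, which is why delayed billing cycles historically let overruns slip through.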
In the interview, Shaxted explained that organizations have two options for using ACTIVATE HSP. Some, like the DoD's High Performance Computing Modernization Program (HPCMP), are offering it as a shared service for other agencies. Others can inherit the accredited package to set up their own dedicated environments, cutting the time needed to achieve operational approval.

Why IL5 Accreditation Matters

Impact Level 5 is the DoD's standard for systems that manage sensitive national security information but remain connected to the public internet. Achieving IL5 compliance requires meeting a long list of security controls, drawn from FedRAMP High standards with additional DoD-specific requirements.

Parallel Works spent nearly three years on the accreditation process. The effort involved satisfying more than 400 security controls, compiling a documentation package exceeding 2,000 pages, and undergoing third-party and DISA audits.

IL5 authorization allows users to store, process, and transmit CUI and ITAR-controlled data without requiring isolated, air-gapped environments. This distinction is critical because it enables more flexible computing architectures while still satisfying mission security requirements. By inheriting a fully audited environment, users can avoid the time and expense typically required to secure their own ATO. In many cases, projects that would have needed 18 months of security work can begin operating in a matter of days or weeks.

What ACTIVATE HSP Means for DoD Project Teams

The benefits of the ACTIVATE High Security Platform are practical: reduced costs, flexible access to secure compute, and deployment timelines measured in days, not months. Take, for example, a DoD project team preparing to run AI-enabled image analysis workflows on satellite data. Without a pre-authorized environment, the team would face months of system hardening, documentation, security reviews, and risk management evaluations, often before uploading a single dataset. These delays can stall mission timelines and slow the delivery of critical insights to decision-makers.

With ACTIVATE HSP, the same team could inherit a compliant environment, configure resources through a familiar interface, and begin training models within days. They can also scale up as needed across cloud and on-premises infrastructure, without being limited by local capacity or held back by compliance barriers.

The cost difference is just as stark. Building an IL5-compliant system independently can run into the millions, factoring in consulting fees, security tooling, and internal labor. By inheriting Parallel Works' existing accreditation package, users reduce that overhead to the baseline costs of platform access and cloud consumption.

Expanding Access to Secure Cloud Infrastructure

The launch of ACTIVATE HSP comes at a time when the DoD is facing growing pressure to modernize its digital infrastructure. As AI, digital twins, and autonomous systems become more central to defense operations, the demand for computing environments that combine performance with strong security continues to rise.

Parallel Works CEO Matthew Shaxted

In the near term, Parallel Works is focused on onboarding new users through shared environments such as the HPCMP's implementation. Support for Azure Government and Google Cloud is expected later this summer, giving defense teams the ability to choose the cloud providers that best meet their needs.

One may wonder: how did a small company end up building one of the only IL5-authorized control planes available to the DoD? The origin of ACTIVATE HSP traces back to a competitive solicitation issued by the Defense Innovation Unit (DIU) three years ago. DIU, which helps adapt commercial technology for military use, had called for hybrid computing solutions that could address emerging security and scalability demands. Parallel Works' original ACTIVATE platform aligned with the technical vision but lacked the high-security capabilities needed for sensitive workloads. Over the next three years, the company worked closely with defense stakeholders to develop, test, and certify what became ACTIVATE HSP. "They really saw the need first," Shaxted said. "We spent the last three years making it real."

Parallel Works executives will present the platform next week at Special Operations Forces (SOF) Week in Tampa, Florida, May 5-8, 2025, where they will meet with prospective users from across the defense and national security ecosystem. More information about the platform is available at this link.
  • WWW.AIWIRE.NET
    Three Ways AI Can Weaken Your Cybersecurity
Even before generative AI arrived on the scene, companies struggled to adequately secure their data, applications, and networks. In the never-ending cat-and-mouse game between the good guys and the bad guys, the bad guys win their share of battles. However, the arrival of GenAI brings new cybersecurity threats, and adapting to them is the only hope for survival.

There's a wide variety of ways that AI and machine learning interact with cybersecurity, some of them good and some of them bad. But in terms of what's new to the game, three patterns stand out and deserve particular attention: slopsquatting, prompt injection, and data poisoning.

Slopsquatting

Slopsquatting is a fresh AI take on typosquatting, where ne'er-do-wells spread malware to unsuspecting Web travelers who happen to mistype a URL. With slopsquatting, the bad guys spread malware through software development libraries that have been hallucinated by GenAI.

Slopsquatting is a new way to compromise AI systems. (Source: flightofdeath/Shutterstock)

We know that large language models (LLMs) are prone to hallucinations. The tendency to create things out of whole cloth is not so much a bug of LLMs as a feature that's intrinsic to the way LLMs are developed. Some of these confabulations are humorous, but others can be serious. Slopsquatting falls into the latter category.

Large companies have reportedly recommended Python libraries that were hallucinated by GenAI. In a recent story in The Register, Bar Lanyado, security researcher at Lasso Security, explained that Alibaba recommended users install a fake version of the legitimate library called huggingface-cli.

While it is still unclear whether the bad guys have weaponized slopsquatting yet, GenAI's tendency to hallucinate software libraries is perfectly clear. Last month, researchers published a paper concluding that GenAI recommends Python and JavaScript libraries that don't exist about one-fifth of the time. "Our findings reveal that the average percentage of hallucinated packages is at least 5.2% for commercial models and 21.7% for open-source models, including a staggering 205,474 unique examples of hallucinated package names, further underscoring the severity and pervasiveness of this threat," the researchers wrote in the paper, titled "We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs."

Of the 205,000+ instances of package hallucination, the names appeared to be inspired by real packages 38% of the time, were the result of typos 13% of the time, and were completely fabricated 51% of the time.
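One practical defense these numbers suggest is verifying that every LLM-suggested dependency actually exists on the official registry before installing it. A minimal sketch using PyPI's public JSON endpoint (the suggested package names are illustrative):

```python
# Pre-install guard against slopsquatting: before installing an LLM-suggested
# dependency, confirm the name is actually known to PyPI via its public JSON
# endpoint (HTTP 200 if the package exists, 404 otherwise).

import urllib.error
import urllib.request

def exists_on_pypi(package: str) -> bool:
    """Return True if PyPI has a project page for this package name."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404: likely a hallucinated or typo'd name

llm_suggested = ["requests", "definitely-not-a-real-pkg-xyz"]
for pkg in llm_suggested:
    status = "found" if exists_on_pypi(pkg) else "NOT FOUND, do not install"
    print(f"{pkg}: {status}")
```

Existence alone is weak evidence, though: attackers deliberately register popular hallucinated names, so teams typically also check download counts, maintainer history, and internal allowlists before trusting a suggested package.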
Prompt Injection

Just when you thought it was safe to venture onto the Web, a new threat has emerged: prompt injection. Like the SQL injection attacks that plagued early Web 2.0 warriors who didn't adequately validate database input fields, prompt injection involves the surreptitious insertion of a malicious prompt into a GenAI-enabled application to achieve some goal, ranging from information disclosure to code execution rights.

A list of AI security threats from OWASP. (Source: Ben Lorica)

Mitigating these sorts of attacks is difficult because of the nature of GenAI applications. Instead of inspecting code for malicious entities, organizations must investigate the entirety of a model, including all of its weights. That's not feasible in most situations, forcing them to adopt other techniques, says data scientist Ben Lorica.

"A poisoned checkpoint or a hallucinated/compromised Python package named in an LLM-generated requirements file can give an attacker code-execution rights inside your pipeline," Lorica writes in a recent installment of his Gradient Flow newsletter. "Standard security scanners can't parse multi-gigabyte weight files, so additional safeguards are essential: digitally sign model weights, maintain a bill of materials for training data, and keep verifiable training logs."

A twist on the prompt injection attack was recently described by researchers at HiddenLayer, who call their technique policy puppetry. "By reformulating prompts to look like one of a few types of policy files, such as XML, INI, or JSON, an LLM can be tricked into subverting alignments or instructions," the researchers write in a summary of their findings. "As a result, attackers can easily bypass system prompts and any safety alignments trained into the models." The company says its approach to spoofing policy prompts enables it to bypass model alignment and produce outputs that are in clear violation of AI safety policies, including CBRN (Chemical, Biological, Radiological, and Nuclear), mass violence, self-harm, and system prompt leakage.

Data Poisoning

Data lies at the heart of machine learning and AI models. So if a malicious user can inject, delete, or change the data that an organization uses to train an ML or AI model, then he or she can potentially skew the learning process and force the model to generate an adverse result.

Symptoms and remediations of data poisoning. (Source: CrowdStrike)

A form of adversarial AI attack, data poisoning (or data manipulation) poses a serious risk to organizations that rely on AI. According to the security firm CrowdStrike, data poisoning is a risk to healthcare, finance, automotive, and HR use cases, and can even potentially be used to create backdoors.

"Because most AI models are constantly evolving, it can be difficult to detect when the dataset has been compromised," the company says in a 2024 blog post. "Adversaries often make subtle but potent changes to the data that can go undetected. This is especially true if the adversary is an insider and therefore has in-depth information about the organization's security measures and tools as well as their processes."

Data poisoning can be either targeted or non-targeted. In either case, there are telltale signs that security professionals can look for that indicate whether their data has been compromised.
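The CrowdStrike post does not spell those signs out in code; one simple example of such a check is monitoring incoming training batches for sudden label-distribution drift against a trusted baseline. A toy sketch (the thresholds and data are illustrative, and this is a generic heuristic, not CrowdStrike's method):

```python
# Toy illustration of one "telltale sign" check: flag a training batch whose
# label distribution drifts sharply from a trusted historical baseline.

from collections import Counter

DRIFT_THRESHOLD = 0.15  # max allowed shift in any class fraction

def label_fractions(labels):
    """Per-class fraction of a label list."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}

def drifted(baseline_labels, new_batch_labels):
    """True if any class fraction moved more than DRIFT_THRESHOLD."""
    base = label_fractions(baseline_labels)
    new = label_fractions(new_batch_labels)
    return any(abs(base.get(c, 0) - new.get(c, 0)) > DRIFT_THRESHOLD
               for c in set(base) | set(new))

baseline = ["benign"] * 90 + ["fraud"] * 10
suspect_batch = ["benign"] * 60 + ["fraud"] * 40  # fraud rate jumped to 40%
print("possible poisoning:", drifted(baseline, suspect_batch))  # -> True
```

Real defenses layer checks like this with data provenance tracking and holdout-set evaluation, since the article notes that skilled adversaries, especially insiders, keep their changes subtle enough to pass any single test.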
AI Attacks as Social Engineering

These three AI attack vectors, slopsquatting, prompt injection, and data poisoning, aren't the only ways that cybercriminals can attack organizations via AI. But they are three avenues that AI-using organizations should be aware of if they want to thwart the potential compromise of their systems.

Unless organizations take pains to adapt to the new ways that hackers can compromise systems through AI, they run the risk of becoming victims. Because LLMs behave probabilistically rather than deterministically, they are much more susceptible to social engineering types of attacks than traditional systems, Lorica says.

"The result is a dangerous security asymmetry: exploit techniques spread rapidly through open-source repositories and Discord channels, while effective mitigations demand architectural overhauls, sophisticated testing protocols, and comprehensive staff retraining," Lorica writes. "The longer we treat LLMs as just another API, the wider that gap becomes."

This article first appeared on BigDATAwire.
  • WWW.AIWIRE.NET
    QED-C Workshop Identifies Quantum AI Targets
How, exactly, will quantum computing and AI work together? Following the flood of marketing enthusiasm for Quantum AI during the past couple of years, the Quantum Economic Development Consortium (QED-C) has released a report (Quantum Computing and Artificial Intelligence Use Cases) based on a QED-C workshop held last October.

While not especially granular, the QED-C report provides a good overview for understanding Quantum AI, or how the two disciplines may work together. For example, there are sections on Quantum AI applications in chemistry and materials science modeling; optimization in logistics and energy; weather modeling and environmental science; and signal processing and quantum sensing.

Broadly, the report was prepared by members of the Quantum + AI use case technical committee: Carl Dukatz, Accenture; Pau Farr, D-Wave; Kevin Glynn, Northwestern University; Kenny Heitritter, qBraid; Tom Lubinski, Quantum Circuits Inc.; Rima Oueid, Department of Energy; Travis Scholten, IBM; Allison Schwartz, D-Wave; Keeper Sharkey, ODE, L3C; and Davide Venturelli, USRA.

The report focuses on four topics:

- Novel solutions or applications that could emerge from the synergy of QC and AI that are currently not feasible with classical computing approaches.
- Approaches for which AI could be used to identify use cases for QC.
- Opportunities to use AI technologies to accelerate the development of specific QC technologies or the quantum ecosystem at large.
- The technical advances needed for QC + AI integration in possible areas of their joint application.

Here's an excerpt: "Though independent technologies, QC and AI can complement each other in many significant and multidirectional ways. For example, AI could assist QC by accelerating the development of circuit design, applications, and error correction and generating test data for algorithm development. QC can solve certain types of problems more efficiently, such as optimization and probabilistic tasks, potentially enhancing the ability of AI models to analyze complex patterns or perform computations that are infeasible for classical systems. A hybrid approach integrating the strengths of classical AI methods with the potential of QC algorithms leverages the two technologies to substantially reduce algorithmic complexity, improving the efficiency of computational processes and resource allocation."

QED-C Executive Director Celia Merzbacher said, "Simultaneous advances in quantum computing and AI offer advantages for both fields, individually and collectively. QED-C looked at the potential in the context of practical applications and use cases. At this early stage, industry, academia, and governments must collaborate to make the most of this opportunity."

The QED-C report makes three overarching recommendations:

- Include support for QC + AI in federal quantum and AI initiatives;
- Increase QC + AI research and education in academia;
- Connect industries to accelerate QC + AI technology development, demonstration, and adoption.

Translating these calls to action may be difficult in the current environment of science budget slashing. (The full text of the recommendations is included at the end of the article.)

One of the more active areas highlighted by the report is using AI to accelerate the development of quantum technology itself. Indeed, there's widespread consensus that this use case represents low-hanging fruit.
The report singles out the following areas in which AI could help accelerate quantum development:

- AI could assist QC software and algorithm developers and optimize QC hardware design, including qubits and quantum circuits. Some microchip designers already use AI to develop advanced semiconductors, suggesting a natural extension of AI's role into QC hardware. AI could also help enhance the design of hardware components for quantum networks.
- AI could help design and refine QC algorithms to improve their efficiency and performance.
- Software developers can leverage code assistants trained on QC software development kits to both accelerate code development and increase the number of developers capable of programming such computers.
- AI could address critical QC challenges, such as error correction (e.g., by dynamical optimization of error correction codes based on real-time noise profiles) and noise reduction (e.g., by analysis of patterns of noise).

Overall, the report could have been strengthened with discussion of a few specific case histories, given that QED-C's membership is likely to have such practical experience, if only in POC efforts. It's likely the desire was not to spotlight any particular company's efforts. The full report was first released only to QED-C members in March, but this week it was made available to the broader public.

Link to report: https://quantumconsortium.org/quantum-computing-and-artificial-intelligence-use-cases.

QED-C Quantum + AI Report Recommendations

1. Include support for QC + AI in federal quantum and AI initiatives:

The federal government invests in a substantial and broad portfolio of quantum technology R&D, guided by the National Quantum Initiative (NQI) Act, CHIPS and Science Act, and other legislation. Federal agencies should explicitly include support for R&D for QC + AI hybrid technologies, including for heterogeneous computing environments that comprise multiple computing paradigms, such as quantum processing units, central processing units, graphical processing units, neuromorphic computing, et al.

Federal support for QC + AI R&D should also foster infrastructure and programs that bring experts together to share knowledge and learning. For example, heterogeneous computing testbeds at national labs that are open to the broad research community could support cross-sector applied research aimed at practical application. In fact, the NQI established several national quantum centers, many of which include testbeds, and these should be expanded to explore QC + AI technologies. Specific support is needed for testbeds that facilitate integration of QC with other technologies.

Non-quantum testbeds could also be encouraged to explore potential integration of QC technologies. For example, federally funded testbeds for grid resilience and advanced manufacturing could explore how QC + AI could benefit those fields. The NSF's National AI Research Institutes could include a focus on using AI to develop new QC algorithms, which could in turn advance both QC and AI. Cross-sector collaboration and integration of different technologies are critical for staying at the forefront of QC R&D and increasing opportunities for QC + AI technology deployment.

Finally, the Quantum User Expansion for Science and Technology (QUEST) program authorized by the CHIPS and Science Act provides researchers from academia and the private sector access to commercial quantum computers. QUEST could include support for research specifically on QC + AI.
2. Increase QC + AI research and education in academia:

AI is currently a trendy field, attracting many community college and university students to software and computer science degrees. At the same time, QC is attracting interest among physical science and engineering students. This large pool can be leveraged to advance QC + AI technologies. For example, higher education institutions can introduce more students to both fields by offering interdisciplinary courses involving physics, math, engineering, and computer science. To better prepare students for careers in industry and to build AI capacity at QC companies, universities could partner with QC companies to provide internships and hands-on training. Such a program exists in Canada and would be a worthwhile addition to US efforts.

Government funding agencies such as NSF, DOE, and DARPA could also encourage multidisciplinary QC + AI research by creating programs that fund teams of QC and AI researchers to collaborate. For example, multidisciplinary teams could research classical algorithms to drive efficiencies in real-world quantum use cases or large-scale methods for error correction. The Materials Genome Project, which funded experimental, theoretical, and computational research by multidisciplinary teams, is an example of such an approach. Agencies might need to create mechanisms to bridge program offices to ensure multidisciplinary program funding and management.

3. Connect industries to accelerate QC + AI technology development, demonstration, and adoption:

While AI is being adopted by seemingly every industry, QC + AI is still relatively early-stage, and awareness among end users is low. Better engagement and interaction among the developers of QC and AI and with end users is needed to enable creation of new capabilities, products, and services that provide real business value. QC and AI industry consortia, such as QED-C and the AI Alliance, should join forces to raise awareness among their members, create opportunities for collaboration, and identify gaps that government funding could help to fill. Together these groups can also engage end-user communities to identify sectors that could be early adopters and partners to drive initial applications. Early applications will feed into additional and broader use cases, eventually reaching an inflection point similar to that experienced by AI, after which QC + AI uses will grow exponentially. Hackathons and business-focused QC + AI challenges could push knowledge sharing and spur interest.

Within government, there are opportunities to promote QC + AI development to achieve the goals of programs aimed at industries from advanced manufacturing to microelectronics. For example, Manufacturing USA funds 18 advanced manufacturing institutes that aim to develop diverse manufacturing capabilities. QC + AI has the potential to disrupt manufacturing and enable innovation, and could be infused into many of the institutes' R&D programs. Similarly, the CHIPS R&D program seeks to develop capabilities for future chip technologies. In the 5- to 10-year timeframe, QC + AI will be poised to impact the traditional semiconductor-based computing ecosystem. The CHIPS R&D program needs to include QC + AI research to ensure this emerging technology is seamlessly incorporated into future microelectronics technologies.

This article first appeared on HPCwire.
  • WWW.AIWIRE.NET
    IBM Think 2025: The Mainstreaming of Gen AI and Start of Agentic AI
    IBM on Monday kicked off its annual Think conference, being held in Boston this week. No surprise, generative AI and tools for enabling agentic AI will dominate discussion at the event, which is expected to attract 5,000 attendees. IBM, like most everyone, has high expectations.

"Over the next few years, we expect there will be over a billion new applications constructed using generative AI as they make their way in," said Arvind Krishna, IBM CEO, at a mid-day virtual media/analyst meeting. He was joined by Rob Thomas, chief commercial officer, and Ritika Gunnar, GM for data and AI.

Arvind Krishna, IBM CEO

Citing an IBM CEO study, Krishna said "our clients are expecting to double or even increase the investments on AI beyond that; however, they are finding that only about 25% of the time are they getting the ROI that they expect, driven by a lot of factors: access to enterprise data, the siloed nature of the different applications, together with the fragmentation that is happening in the infrastructure."

The relatively high failure rate and poor ROI of AI projects has become a growing problem for AI advocates. At Think, IBM is touting its watsonx enterprise AI platform and rolling out new features, many designed to tame AI's deployment, execution, and ROI issues. Krishna singled out three:

- watsonx Orchestrate: A tool that, Krishna said, "lets you build your own agent for enterprise use in five minutes or less." It ships with roughly 80 partner agents out of the box, integrated with IBM-built agents for a total of 150.
- webMethods Hybrid Integration: A new solution that IBM says replaces rigid workflows with intelligent and agent-driven automation. It will help users manage the sprawl of integrations across apps, APIs, B2B partners, events, gateways, and file transfers in hybrid cloud environments.
- watsonx.data: This improved tool allows you to bring together "an open data lakehouse with data fabric capabilities like data lineage tracking and governance to help clients unify, govern, and activate data across silos, formats, and clouds," says IBM. "[It] allows enterprises to connect their AI apps and agents with their unstructured data, which can lead to 40% more accurate AI than conventional RAG."

(For a full rundown on the tools being highlighted at Think, read the IBM press release here.)

The fourth major announcement from Krishna was the introduction of a new mainframe: IBM LinuxONE 5, which IBM says is the most secure and performant Linux platform to date, with the ability to process up to 450 billion AI inference operations per day (roughly 5.2 million per second, sustained).

LinuxONE 5 features highlighted by IBM include:

- State-of-the-art IBM AI accelerators, including IBM's Telum II on-chip AI processor and the IBM Spyre Accelerator card (available 4Q 2025 via PCIe card), to enable generative and high-volume AI applications such as transactional workloads.
- Advanced security offerings with confidential containers to help clients protect their data, and new integrations with IBM's pioneering quantum-safe encryption technology to address quantum-enabled cybersecurity attacks.
- Significant reductions in costs and power consumption: moving cloud-native, containerized workloads from a comparable x86 solution to an IBM LinuxONE 5 running the same software products can save up to 44% on the total cost of ownership over five years.

IBM clearly has a broad portfolio of enterprise-centric AI offerings. Its focus on the hybrid cloud and AI together, said Krishna, is key to taming what he called the infrastructure fragmentation going on.
In Q&A, he, Thomas, and Gunnar fielded a variety of questions.

Has IBM seen any slowdown in AI spending?

Krishna said, "The short answer is no; we are actually seeing people double down on their AI investments. As people are looking for productivity, they're looking for cost savings, but they're also looking to scale the revenue of their own companies. AI is one of the unique technologies that can hit at the intersection of all three. The CIO surveys done by our IBV group show that everybody is doubling down on AI investments, but they're now looking for that return on AI. So the only change over the last 12 months is that people are stopping experimentation and focusing very much on where the value to the business is."

Thomas chipped in: "On ROI, I'd say this is the year that value creation has become the focus. So it's not a disillusionment with AI as much as it is clients demanding value creation. I think we got through experimentation with things like RAG, and, to be blunt, some of the results were uncertain or mixed. I think the minute it turned to automation, being able to automate your core infrastructure with things like Terraform or Vault as an example, automating your financials, moving into assistants and then agents, that's where we saw, I'd say, the tipping point for value creation."

Another questioner noted there are often more exceptions than rules in a business process: how does IBM create the level of dynamic reasoning required for AI agents to manage a workflow end to end?

Krishna provided an example from IBM: "We have a lot of experience in this. I'll give a simple example, and then we'll broaden it from there. We looked at our HR processes, and we began two years ago with the top five most common queries that were being done. We automated those [and] every couple of weeks after, we kept adding, until at this point we have over 80 of the common actions. So those are workflows that have been fully automated.

"There are actually about 6% of what we do that we don't think there's an ROI in automating at this point. As the technology becomes better over time, it may be possible that the cost to do that comes down, and that is the approach we sort of take across the board. In many, many things that are being done, the human is never going to be out of the loop. The human is always going to be there for the most complex or the edge cases where the AI does not have confidence in its own answer," he said.
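The escalation rule Krishna describes, automating the common high-confidence cases while keeping a human in the loop for the rest, is a simple pattern to express in code. The sketch below is illustrative only; the threshold value and the sample tasks are assumptions, not IBM's implementation.

```python
# Sketch of confidence-based routing: high-confidence agent decisions
# flow through automation; low-confidence or edge cases go to a human.
from dataclasses import dataclass

@dataclass
class AgentDecision:
    task: str
    answer: str
    confidence: float  # model-reported confidence in [0, 1]

def route(decision: AgentDecision, threshold: float = 0.85) -> str:
    """Return the handling path for a single agent decision."""
    if decision.confidence >= threshold:
        return f"AUTOMATED: {decision.task} -> {decision.answer}"
    # Low confidence: keep the human in the loop.
    return f"ESCALATED TO HUMAN: {decision.task} (confidence {decision.confidence:.2f})"

print(route(AgentDecision("PTO balance query", "You have 12 days remaining", 0.97)))
print(route(AgentDecision("Cross-border payroll adjustment", "Proposed change", 0.41)))
```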
Another question posed was: While there is a lot of enthusiasm, companies are facing challenges when it comes to adoption. How many clients have implemented so far? How many POCs, and how many POCs are translating to full-time projects?

That's a good milestone question. Krishna said, "[To] give a little bit of context on this question, I think the precursor to agents was assistants. If you look at assistants, I think at last count, we had over 20,000 deployments. Sometimes there were three or four at one client, but over 20,000 deployments. So I think it's actually going back to a comment Rob made earlier: while there are some difficulties, there is also a lot of eagerness, excitement, and enthusiasm about adopting these technologies. So 20,000 was our number. If I look at proof of concepts, I think our teams do over 4,000 or 5,000 each year in terms of kicking the tires. And about half of those, by the way, do make their way into some form of production. I wouldn't say all of them, but about half of them do make their way forward.

"I also think that we should understand that everybody, from our partners at SAP with Joule technology, Salesforce with their Agentforce technologies, ServiceNow with their agents on workflows, Adobe with their agents on experience, and Workday with their agents on how to make the HR process more effective, to Oracle with some of the agents around both HR as well as payroll (I can go on): those are being extremely well adopted in all of these cases, including our own, across many, many clients, because they help. Now, these are not general-purpose agents. Almost all the examples I mentioned do a small but very precise set of tasks extremely well."

Gunnar added, "We have an IBV study that we just released today, or will be released tomorrow, [finding] that the majority of enterprises are doing proof of concepts or are leveraging AI agent technology, whether it be from their single application vendors, like in the examples that Arvind mentioned, or whether it be agents that they build on their own. So we know that for a majority of organizations, 2025 is the year of experimenting with AI agents. Now we also see some of those in production. And when we see those in production, they're usually simple, task-based execution AI agents.

"We're seeing a lot of research and technology maturing now, in terms of the underlying reasoning capabilities of these models and in terms of what it means to have full-stack observability and traceability. We're now starting to see, even in the AI agent technologies, more complex use cases such as automation, collaboration, and orchestration. So you can see that as a spectrum: many organizations are experimenting, simple task-based AI agents are being implemented in production, and as the technology continues to mature, we're going to see even the more complex cases of orchestration, collaboration, and automation move into production, whereby in the next couple of years, we believe that over 50% of organizations will have these AI agents embedded in their essential systems."

Thomas added, "To give you a specific example, about a year and a half ago we partnered with Dun & Bradstreet to build an assistant called Ask Procurement. This was using procurement data and company data and shipping data from Dun & Bradstreet. We constructed this application, which leverages Orchestrate and watsonx and gives a procurement agent the ability to ask questions and to figure out the best place to source something. We've now started working on how to make that agentic, because today it's really just AI Q&A. Making it agentic would mean actually linking it to, if I'm choosing a supplier, what I can get in terms of actual delivery times as I look at shipping schedules, shipping lanes, whatever it may be. So I would say the technology is there. This is about the change management and then applying it to these specific problems. But for most companies, I think starting with an assistant and then moving toward agentic workflows is often the fastest, lowest-risk way to do it."

Link to IBM press announcement: https://www.hpcwire.com/off-the-wire/ibm-accelerates-enterprise-gen-ai-revolution-with-hybrid-capabilities

Link to THINK: https://www.ibm.com/events/think

This article first appeared on HPCwire.
  • WWW.AIWIRE.NET
    SAS Rolls Out AI Agents, Digital Twins, and Industry Models at Innovate 2025
    At its SAS Innovate 2025 conference in Orlando, SAS rolled out a series of updates aimed at helping businesses make better decisions with AI. The announcements touched on a broad set of topics, from digital twins and quantum computing to new AI models and updates to its Viya AI and analytics platform. Taken together, the message was clear: SAS is focused on building AI that is usable, governed, and designed for practical outcomes.

With decades of experience in enterprise analytics, SAS is now applying that expertise to the data and decisioning layer, or how information moves through an organization and how AI can support clear outcomes. The conference's announcements show how that approach is reflected in its products, partnerships, and latest research. Here's a look at a few of the first day's announcements.

AI Agents With Built-In Guardrails

SAS is developing AI agents, or tools that can make decisions or carry out tasks based on data. These agents are designed to follow clear rules and stay within defined boundaries, rather than acting independently without limits. The goal is to help organizations use automation in a controlled and predictable way.

SAS Innovate 2025 kicked off in Orlando this week.

The agent framework combines traditional rule-based analytics with newer machine learning techniques, including large language models. This hybrid approach allows users to define specific conditions for how agents behave, including when to escalate a decision to a human. Organizations can tune this balance based on the risk, complexity, or regulatory context of each task.

Governance is built in, as the agents log decisions, apply access controls, and support audit and compliance requirements. The aim is to make AI decisioning more transparent and easier to monitor, especially in fields like finance, healthcare, and public sector work, where explainability and accountability are essential.
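To illustrate the kind of risk-tuned autonomy and decision logging described above, here is a minimal sketch. The risk tiers, policy table, and task names are illustrative assumptions, not SAS's actual framework or API.

```python
# Illustrative sketch (not SAS's implementation) of risk-tuned agent
# autonomy: low-risk tasks run autonomously, higher-risk ones require
# human sign-off, and every decision is logged for audit review.
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

AUTONOMY_BY_RISK = {
    "low": "autonomous",        # e.g., marketing-style tasks
    "medium": "human_approval",
    "high": "human_only",       # e.g., credit or healthcare decisions
}

def execute(task: str, risk: str) -> str:
    mode = AUTONOMY_BY_RISK[risk]
    logging.info("task=%r risk=%s mode=%s", task, risk, mode)  # audit trail
    if mode == "autonomous":
        return f"agent executed: {task}"
    if mode == "human_approval":
        return f"agent drafted '{task}'; awaiting human sign-off"
    return f"'{task}' routed to a human decision-maker"

print(execute("segment customers for email campaign", "low"))
print(execute("approve loan application", "high"))
```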
Digital Twins That Run on Gaming Tech

SAS is working with Epic Games to bring the gaming company's 3D simulation technology into digital twin tools. By using Unreal Engine, a 3D engine originally developed by Epic Games for video games, SAS can create detailed virtual models of manufacturing environments, or digital twins. Companies like Georgia-Pacific are testing this setup to simulate things like equipment movement and planning, as well as factory layout changes and safety measures, before making changes in the real setting.

The digital twins combine real-time data from sensors and IoT devices with SAS analytics and interactive visuals. This allows engineers and operators to run simulations that reflect actual conditions, helping them troubleshoot issues or improve efficiency without interrupting production.

While the current focus is on manufacturing, SAS says the same approach could apply to hospital operations, city planning, and other fields where physical systems are complex and hard to model.

Prebuilt AI Models for Specific Use Cases

SAS has released a set of prebuilt AI models aimed at solving common problems in sectors like health care, manufacturing, and government. The offerings include models for document analysis, identifying duplicate records across systems, and optimizing supply chains.

Each model comes with documentation and can be used as-is or adapted using an organization's own data. They are designed to integrate into existing systems with minimal setup, helping teams get to deployment faster without starting from zero.

SAS is also building toward more automated systems that pair these models with AI agents. In one example, an agent can automatically prepare data for modeling by creating and managing data lakes, a task that typically requires significant manual effort. This kind of pairing is part of SAS's broader push to simplify the AI pipeline, from data ingestion to model deployment.

These announcements show where SAS is putting its attention: building practical tools that help organizations use AI more effectively and with more control. While the focus has been on practical tools and responsible AI, the day also offered some lighter moments. Monday's memorable moments included appearances from Frank Abagnale, Jr. (the real-life figure behind Catch Me If You Can) as well as a full-on '90s nostalgia hour featuring Fresh Prince alumni Alfonso Ribeiro and DJ Jazzy Jeff.

If day one is any indication, SAS Innovate is going to keep blending practical AI updates with a few curveballs along the way. AIwire will be covering more from the show, including SAS's work on quantum AI, a new tool for generating synthetic data, and updates to the SAS Viya platform. We've also had the chance to speak with several SAS executives, and we'll be sharing those insights. Stay tuned for more.
  • WWW.AIWIRE.NET
    IBM Think 2025: Download a Sneak Peek of the Next Gen Granite Models
    At IBM Think 2025, IBM announced Granite 4.0 Tiny Preview, a preliminary version of the smallest model in the upcoming Granite 4.0 family of language models, to the open source community. IBM Granite models are a series of AI foundation models. Initially intended for use in IBM's cloud-based data and generative AI platform watsonx, along with other models, IBM has opened the source code of some code models. IBM Granite models are trained on datasets curated from the internet, academic publications, code datasets, and legal and finance documents. The following is based on the IBM Think news announcement.

At FP8 precision, Granite 4.0 Tiny Preview is extremely compact and compute-efficient. It allows several concurrent sessions performing long-context (128K) tasks to run on consumer-grade hardware, including GPUs.

Though the model is only partially trained (it has seen only 2.5T of a planned 15T or more training tokens), it already offers performance rivaling that of IBM Granite 3.3 2B Instruct, despite fewer active parameters and a roughly 72% reduction in memory requirements. IBM anticipates Granite 4.0 Tiny's performance to be on par with Granite 3.3 8B Instruct by the time it has completed training and post-training.

Granite-4.0 Tiny performance compared to Granite-3.3 2B Instruct. Click to enlarge. (Source: IBM)

As its name suggests, Granite 4.0 Tiny will be among the smallest offerings in the Granite 4.0 model family. It will be officially released this summer as part of a model lineup that also includes Granite 4.0 Small and Granite 4.0 Medium. Granite 4.0 continues IBM's commitment to making efficiency and practicality the cornerstone of its enterprise LLM development.

This preliminary version of Granite 4.0 Tiny is now available on Hugging Face under a standard Apache 2.0 license. IBM intends to allow GPU-poor developers to experiment and tinker with the model on consumer-grade GPUs. The model's novel architecture is pending support in Hugging Face transformers and vLLM, which IBM anticipates will be completed shortly for both projects. Official support to run this model locally through platform partners, including Ollama and LMStudio, is expected in time for the full model release later this summer.

Enterprise Performance on Consumer Hardware

IBM also notes that LLM memory requirements are often provided, literally and figuratively, without proper context. It's not enough to know that a model can be successfully loaded into your GPU(s); you need to know that your hardware can handle the model at the context lengths that your use case requires.

Furthermore, many enterprise use cases entail not just a single model deployment, but batch inferencing of multiple concurrent instances. Therefore, IBM endeavors to measure and report memory requirements with long context and concurrent sessions in mind.

In that respect, IBM believes Granite 4.0 Tiny is one of today's most memory-efficient language models. Even at very long contexts, several concurrent instances of Granite 4.0 Tiny can easily run on a modest consumer GPU.

Granite-4.0 Tiny memory requirements vs. other popular models. Click to enlarge. (Source: IBM)
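IBM's point about context-aware memory sizing can be made concrete with the standard KV-cache formula for transformer layers: cache size grows linearly with context length and with the number of concurrent sessions. The model dimensions below are hypothetical, chosen only to show the arithmetic; they are not Granite's configuration.

```python
# Estimate transformer KV-cache memory: 2x for keys and values,
# times layers, KV heads, head dim, context length, sessions, and
# bytes per element (2 for fp16/bf16).
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, sessions: int,
                 bytes_per_elem: int = 2) -> float:
    total = (2 * n_layers * n_kv_heads * head_dim
             * context_len * sessions * bytes_per_elem)
    return total / 2**30

# Hypothetical 32-layer model, 8 KV heads of dim 128, 128K context:
print(f"{kv_cache_gib(32, 8, 128, 128_000, sessions=1):.1f} GiB per session")
print(f"{kv_cache_gib(32, 8, 128, 128_000, sessions=4):.1f} GiB for 4 sessions")
```

For this hypothetical configuration, a single 128K-token session needs roughly 15.6 GiB of KV cache on top of the weights, and four concurrent sessions need about four times that, which is exactly why reporting memory "with long context and concurrent sessions in mind" matters.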
An All-new Hybrid MoE Architecture

Whereas prior generations of Granite LLMs utilized a conventional transformer architecture, all models in the Granite 4.0 family utilize a new hybrid Mamba-2/Transformer architecture, marrying the speed and efficiency of Mamba with the precision of transformer-based self-attention. Granite 4.0 Tiny Preview is a fine-grained hybrid mixture of experts (MoE) model, with 7B total parameters and only 1B active parameters at inference time.

Many innovations informing the Granite 4 architecture arose from IBM Research's collaboration with the original Mamba creators on Bamba, an experimental open-source hybrid model whose successor (Bamba v2) was released earlier this week.

A Brief History of Mamba Models

Mamba is a type of state space model (SSM) introduced in 2023, about six years after the debut of transformers in 2017. SSMs are conceptually similar to the recurrent neural networks (RNNs) that dominated natural language processing (NLP) in the pre-transformer era. They were originally designed to predict the next state of a continuous sequence (like an electrical signal) using only information from the current state, the previous state, and the range of possibilities (the state space). Though they've been used across several domains for decades, SSMs share certain shortcomings with RNNs that, until recently, limited their potential for language modeling.

Unlike the self-attention mechanism of transformers, conventional SSMs have no inherent ability to selectively focus on or ignore specific pieces of contextual information. So in 2023, Carnegie Mellon's Albert Gu and Princeton's Tri Dao introduced a type of structured state space sequence (S4) neural network that adds a selection mechanism and a scan method (for computational efficiency), abbreviated as an S6 model, and achieved language modeling results competitive with transformers. They nicknamed their model Mamba because, among other reasons, all of those S's sound like a snake's hiss.

In 2024, Gu and Dao released Mamba-2, a simplified and optimized implementation of the Mamba architecture. Equally importantly, their technical paper fleshed out the compatibility between SSMs and self-attention.

Mamba-2 vs. Transformers

Mamba's major advantages over transformer-based models center on efficiency and speed. Transformers have a crucial weakness: the computational requirements of self-attention scale quadratically with context. In other words, each time your context length doubles, the attention mechanism doesn't just use double the resources; it uses quadruple the resources. This quadratic bottleneck increasingly throttles speed and performance as the context window (and corresponding KV-cache) grows.

Conversely, Mamba's computational needs scale linearly: if you double the length of an input sequence, Mamba uses only double the resources. Whereas self-attention must repeatedly compute the relevance of every previous token to each new token, Mamba simply maintains a condensed, fixed-size summary of prior context. As the model reads each new token, it determines that token's relevance, then updates (or doesn't update) the summary accordingly. Essentially, whereas self-attention retains every bit of information and then weights the influence of each based on its relevance, Mamba selectively retains only the relevant information.

While transformers are more memory-intensive and computationally redundant, the method has its own advantages. For instance, research has shown that transformers still outpace Mamba and Mamba-2 on tasks requiring in-context learning (such as few-shot prompting), copying, or long-context reasoning.
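The fixed-size, selectively updated summary described above is the heart of the SSM advantage. The toy sketch below processes a sequence in a single linear pass while keeping a constant-size state; the gating math is a simplified stand-in for the idea, not Mamba's actual formulation.

```python
# Toy illustration of a selective, fixed-size state summary: cost grows
# linearly with sequence length, while self-attention would compare each
# token against every previous token (quadratic growth).
import numpy as np

def selective_scan(tokens: np.ndarray, d_state: int = 16) -> np.ndarray:
    rng = np.random.default_rng(0)
    W_g = rng.normal(size=(tokens.shape[1],))           # toy gate weights
    W_h = rng.normal(size=(tokens.shape[1], d_state))   # toy input projection
    state = np.zeros(d_state)
    for x in tokens:                            # one pass: O(sequence length)
        gate = 1 / (1 + np.exp(-x @ W_g))       # how relevant is this token?
        state = (1 - gate) * state + gate * (x @ W_h)   # selectively update
    return state

seq = np.random.default_rng(1).normal(size=(1000, 8))   # 1,000 tokens, dim 8
print(selective_scan(seq).shape)   # state stays fixed-size: (16,)
```

Doubling the sequence to 2,000 tokens doubles the loop iterations and nothing else, which is the linear-scaling property the article describes.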
The Best of Both Worlds

Fortunately, the respective strengths of transformers and Mamba are not mutually exclusive. In the original Mamba-2 paper, authors Dao and Gu suggest that a hybrid model could exceed the performance of a pure transformer or SSM, a notion validated by Nvidia research from last year. To explore this further, IBM Research collaborated with Dao and Gu themselves, along with the University of Illinois Urbana-Champaign's (UIUC) Minjia Zhang, on Bamba and Bamba V2. Bamba, in turn, informed many of the architectural elements of Granite 4.0.

The Granite 4.0 MoE architecture employs nine Mamba blocks for every one transformer block. In essence, the selectivity mechanisms of the Mamba blocks efficiently capture global context, which is then passed to transformer blocks that enable a more nuanced parsing of local context. The result is a dramatic reduction in memory usage and latency with no apparent tradeoff in performance.

Granite 4.0 Tiny doubles down on these efficiency gains by implementing them within a compact, fine-grained mixture of experts (MoE) framework comprising 7B total parameters and 64 experts, yielding 1B active parameters at inference time. Further details are available in Granite 4.0 Tiny Preview's Hugging Face model card.
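The 9:1 block ratio and the MoE figures above imply a layer layout and an active-parameter fraction that are easy to sanity-check. The sketch below uses only the ratios stated in the text; the group count and everything else are illustrative, not Granite's exact published configuration.

```python
# Sketch of the 9:1 Mamba/transformer interleaving and MoE sparsity
# described above. Block counts per group follow the stated ratio.
def hybrid_layout(n_groups: int) -> list[str]:
    # Nine Mamba blocks for every one transformer (attention) block.
    return (["mamba"] * 9 + ["attention"]) * n_groups

layers = hybrid_layout(n_groups=4)          # group count is an assumption
print(layers[:10])                          # one group: 9 mamba, 1 attention
print(layers.count("mamba") / len(layers))  # 0.9 of blocks are Mamba

# MoE sparsity: 7B total parameters, but only ~1B are active per token,
# because only a few of the 64 experts fire for any given token.
total_params, active_params = 7e9, 1e9
print(f"active fraction per token: {active_params / total_params:.0%}")  # ~14%
```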
Unconstrained Context Length

One of the more tantalizing aspects of SSM-based language models is their theoretical ability to handle infinitely long sequences. However, due to practical constraints, the word "theoretical" typically does a lot of heavy lifting.

One of those constraints, especially for hybrid-SSM models, comes from the positional encoding (PE) used to represent information about the order of words. PE adds computational steps, and research has shown that models using PE techniques such as rotary positional encoding (RoPE) struggle to generalize to sequences longer than they've seen in training.

The Granite 4.0 architecture uses no positional encoding (NoPE). IBM's testing convincingly demonstrates that this has had no adverse effect on long-context performance.

At present, IBM has already validated Tiny Preview's long-context performance for at least 128K tokens and expects to validate similar performance on significantly longer context lengths by the time the model has completed training and post-training. It's worth noting that a key challenge in definitively validating performance on tasks in the neighborhood of 1M-token context is the scarcity of suitable datasets.

The other practical constraint on Mamba context length is compute. Linear scaling is better than quadratic scaling, but it still adds up eventually. Here again, Granite 4.0 Tiny has two key advantages:
- Unlike PE, NoPE doesn't add additional computational burden to the attention mechanism in the model's transformer layers.
- Granite 4.0 Tiny is extremely compact and efficient, leaving plenty of hardware space for linear scaling.

Put simply, the Granite 4.0 MoE architecture itself does not constrain context length. It can go as far as your hardware resources allow.

What's Happening Next

IBM expressed its excitement about continuing to pre-train Granite 4.0 Tiny, given such promising results so early in the process. It is also excited to apply its lessons from post-training Granite 3.3, particularly regarding reasoning capabilities and complex instruction following, to the new models. More information about new developments in the Granite series was presented at IBM Think 2025, with more to come in the following weeks and months.

You can find Granite 4.0 Tiny on Hugging Face.

This article is based on the IBM Think news announcement authored by Kate Soule, Director, Technical Product Management, Granite, and Dave Bergmann, Senior Writer, AI Models at IBM.