WWW.GOCOMICS.COM
Animal Crackers by Mike Osbun for Sat, 12 Apr 2025
Source: Patreon
-
WWW.GOCOMICS.COM
Animal Crackers by Mike Osbun for Fri, 11 Apr 2025
Source: Patreon
-
WWW.AIWIRE.NET
IBM Think 2025: Download a Sneak Peek of the Next Gen Granite Models

At IBM Think 2025, IBM released Granite 4.0 Tiny Preview, a preliminary version of the smallest model in the upcoming Granite 4.0 family of language models, to the open source community. IBM Granite models are a series of AI foundation models. Initially intended for use in watsonx, IBM's cloud-based data and generative AI platform, alongside other models, IBM has opened the source code of some of its code models. IBM Granite models are trained on datasets curated from the Internet, academic publications, code datasets, and legal and finance documents. The following is based on the IBM Think news announcement.

At FP8 precision, Granite 4.0 Tiny Preview is extremely compact and compute-efficient. It allows several concurrent sessions performing long-context (128K) tasks to run on consumer-grade hardware, including consumer GPUs. Though the model is only partially trained (it has seen just 2.5T of a planned 15T or more training tokens), it already offers performance rivaling that of IBM Granite 3.3 2B Instruct despite fewer active parameters and a roughly 72% reduction in memory requirements. IBM anticipates Granite 4.0 Tiny's performance to be on par with Granite 3.3 8B Instruct by the time it has completed training and post-training.

Granite-4.0 Tiny performance compared to Granite-3.3 2B Instruct. (Source: IBM)

As its name suggests, Granite 4.0 Tiny will be among the smallest offerings in the Granite 4.0 model family. It will be officially released this summer as part of a model lineup that also includes Granite 4.0 Small and Granite 4.0 Medium. Granite 4.0 continues IBM's commitment to making efficiency and practicality the cornerstone of its enterprise LLM development.

This preliminary version of Granite 4.0 Tiny is now available on Hugging Face under a standard Apache 2.0 license. IBM intends to allow GPU-poor developers to experiment and tinker with the model on consumer-grade GPUs. The model's novel architecture is pending support in Hugging Face transformers and vLLM, which IBM anticipates will be completed shortly for both projects. Official support to run the model locally through platform partners, including Ollama and LMStudio, is expected in time for the full model release later this summer.

Enterprise Performance on Consumer Hardware

IBM also notes that LLM memory requirements are often provided, literally and figuratively, without proper context. It's not enough to know that a model can be successfully loaded into your GPU(s): you need to know that your hardware can handle the model at the context lengths your use case requires. Furthermore, many enterprise use cases entail not just a single model deployment but batch inferencing of multiple concurrent instances. Therefore, IBM endeavors to measure and report memory requirements with long context and concurrent sessions in mind.

In that respect, IBM believes Granite 4.0 Tiny is one of today's most memory-efficient language models. Even at very long context lengths with several concurrent instances running, Granite 4.0 Tiny can easily run on a modest consumer GPU.

Granite-4.0 Tiny memory requirements vs other popular models. (Source: IBM)
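To make that framing concrete, here is a rough back-of-the-envelope sketch, not IBM's published methodology, of why FP8 weights plus per-session context state dominate serving memory. The 7B total-parameter figure comes from the model details below; the per-session overhead is an invented placeholder.

```python
# Rough memory estimate for serving a model at FP8 precision.
# Illustrative only: real usage depends on architecture, state/KV
# cache layout, activations, and runtime overhead.

TOTAL_PARAMS = 7e9         # Granite 4.0 Tiny Preview: 7B total parameters
BYTES_PER_PARAM_FP8 = 1    # FP8 stores one byte per parameter
GIB = 1024 ** 3

def weight_memory_gib(params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return params * bytes_per_param / GIB

def serving_estimate_gib(sessions: int, per_session_gib: float) -> float:
    """Weights (loaded once) plus per-session context state.

    per_session_gib is a made-up placeholder; hybrid SSM models keep a
    fixed-size state per sequence rather than a KV cache that keeps
    growing with context length.
    """
    return weight_memory_gib(TOTAL_PARAMS, BYTES_PER_PARAM_FP8) + sessions * per_session_gib

print(f"FP8 weights alone: {weight_memory_gib(TOTAL_PARAMS, 1):.1f} GiB")
print(f"8 concurrent 128K sessions (assumed 0.5 GiB each): "
      f"{serving_estimate_gib(8, 0.5):.1f} GiB")
```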
An All-new Hybrid MoE Architecture

Whereas prior generations of Granite LLMs utilized a conventional transformer architecture, all models in the Granite 4.0 family utilize a new hybrid Mamba-2/Transformer architecture, marrying the speed and efficiency of Mamba with the precision of transformer-based self-attention. Granite 4.0 Tiny Preview is a fine-grained hybrid mixture of experts (MoE) model, with 7B total parameters and only 1B active parameters at inference time. Many innovations informing the Granite 4.0 architecture arose from IBM Research's collaboration with the original Mamba creators on Bamba, an experimental open-source hybrid model whose successor (Bamba v2) was released earlier this week.

A Brief History of Mamba Models

Mamba is a type of state space model (SSM) introduced in 2023, about six years after the debut of transformers in 2017. SSMs are conceptually similar to the recurrent neural networks (RNNs) that dominated natural language processing (NLP) in the pre-transformer era. They were originally designed to predict the next state of a continuous sequence (like an electrical signal) using only information from the current state, the previous state, and the range of possibilities (the state space). Though they've been used across several domains for decades, SSMs share certain shortcomings with RNNs that, until recently, limited their potential for language modeling.

Unlike the self-attention mechanism of transformers, conventional SSMs have no inherent ability to selectively focus on or ignore specific pieces of contextual information. So in 2023, Carnegie Mellon's Albert Gu and Princeton's Tri Dao introduced a type of structured state space sequence (S4) neural network that adds a selection mechanism and a scan method (for computational efficiency), abbreviated as an S6 model, and achieved language modeling results competitive with transformers. They nicknamed their model Mamba because, among other reasons, all of those S's sound like a snake's hiss.

In 2024, Gu and Dao released Mamba-2, a simplified and optimized implementation of the Mamba architecture. Equally importantly, their technical paper fleshed out the compatibility between SSMs and self-attention.

Mamba-2 vs. Transformers

Mamba's major advantages over transformer-based models center on efficiency and speed. Transformers have a crucial weakness: the computational requirements of self-attention scale quadratically with context. In other words, each time your context length doubles, the attention mechanism doesn't just use double the resources, it uses quadruple the resources. This quadratic bottleneck increasingly throttles speed and performance as the context window (and corresponding KV cache) grows.

Conversely, Mamba's computational needs scale linearly: if you double the length of an input sequence, Mamba uses only double the resources. Whereas self-attention must repeatedly compute the relevance of every previous token to each new token, Mamba simply maintains a condensed, fixed-size summary of prior context. As the model reads each new token, it determines that token's relevance, then updates (or doesn't update) the summary accordingly. Essentially, whereas self-attention retains every bit of information and then weights the influence of each piece based on its relevance, Mamba selectively retains only the relevant information.

While transformers are more memory-intensive and computationally redundant, the approach has its own advantages. For instance, research has shown that transformers still outpace Mamba and Mamba-2 on tasks requiring in-context learning (such as few-shot prompting), copying, or long-context reasoning.
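A toy calculation makes the difference in growth rates tangible. The unit costs below are arbitrary placeholders; only the quadratic-versus-linear shape reflects the argument above.

```python
# Toy comparison of how self-attention vs. an SSM scan scales with
# context length. Absolute numbers are meaningless; the ratio is the point.

def attention_cost(n_tokens: int) -> int:
    """Self-attention compares every token against every other: O(n^2)."""
    return n_tokens * n_tokens

def ssm_cost(n_tokens: int) -> int:
    """An SSM updates a fixed-size state once per token: O(n)."""
    return n_tokens

for n in (8_000, 16_000, 32_000, 64_000, 128_000):
    print(f"{n:>7} tokens | attention {attention_cost(n):>16,} | ssm {ssm_cost(n):>9,}")
# Doubling the context quadruples the attention column but only
# doubles the SSM column.
```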
The Best of Both Worlds

Fortunately, the respective strengths of transformers and Mamba are not mutually exclusive. In the original Mamba-2 paper, authors Dao and Gu suggest that a hybrid model could exceed the performance of a pure transformer or SSM, a notion validated by Nvidia research from last year. To explore this further, IBM Research collaborated with Dao and Gu themselves, along with the University of Illinois Urbana-Champaign's (UIUC) Minjia Zhang, on Bamba and Bamba V2. Bamba, in turn, informed many of the architectural elements of Granite 4.0.

The Granite 4.0 MoE architecture employs 9 Mamba blocks for every transformer block. In essence, the selectivity mechanisms of the Mamba blocks efficiently capture global context, which is then passed to transformer blocks that enable a more nuanced parsing of local context. The result is a dramatic reduction in memory usage and latency with no apparent tradeoff in performance.

Granite 4.0 Tiny doubles down on these efficiency gains by implementing them within a compact, fine-grained mixture of experts (MoE) framework, comprising 7B total parameters and 64 experts, yielding 1B active parameters at inference time. Further details are available in Granite 4.0 Tiny Preview's Hugging Face model card.
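As a rough mechanical illustration of what fine-grained MoE routing means, here is a toy top-k router. The 64-expert count matches the article; the top-k value and layer sizes are invented for the example, and this is not IBM's implementation.

```python
import numpy as np

# Toy fine-grained MoE router: each token activates only k of n experts,
# so active parameters per token are a small fraction of the total.
# n_experts matches the article (64); top_k and d_model are invented.

rng = np.random.default_rng(0)
n_experts, top_k, d_model = 64, 8, 512

gate_w = rng.normal(size=(d_model, n_experts))               # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector through its top-k experts."""
    logits = x @ gate_w
    top = np.argsort(logits)[-top_k:]                        # chosen expert indices
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)                # (512,)
print(f"active experts per token: {top_k}/{n_experts}")
```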
Unconstrained Context Length

One of the more tantalizing aspects of SSM-based language models is their theoretical ability to handle infinitely long sequences. However, due to practical constraints, the word "theoretical" typically does a lot of heavy lifting.

One of those constraints, especially for hybrid SSM models, comes from the positional encoding (PE) used to represent information about the order of words. PE adds computational steps, and research has shown that models using PE techniques such as rotary positional encoding (RoPE) struggle to generalize to sequences longer than they've seen in training. The Granite 4.0 architecture uses no positional encoding (NoPE). IBM testing convincingly demonstrates that this has had no adverse effect on long-context performance.

At present, IBM has already validated Tiny Preview's long-context performance for at least 128K tokens and expects to validate similar performance on significantly longer context lengths by the time the model has completed training and post-training. It's worth noting that a key challenge in definitively validating performance on tasks in the neighborhood of 1M-token context is the scarcity of suitable datasets.

The other practical constraint on Mamba context length is compute. Linear scaling is better than quadratic scaling, but it still adds up eventually. Here again, Granite 4.0 Tiny has two key advantages: unlike PE, NoPE doesn't add additional computational burden to the attention mechanism in the model's transformer layers; and Granite 4.0 Tiny is extremely compact and efficient, leaving plenty of hardware space for linear scaling. Put simply, the Granite 4.0 MoE architecture itself does not constrain context length. It can go as far as your hardware resources allow.

What's Happening Next

IBM expressed its excitement about continuing to pre-train Granite 4.0 Tiny, given such promising results so early in the process. It is also excited to apply its lessons from post-training Granite 3.3, particularly regarding reasoning capabilities and complex instruction following, to the new models. More information about new developments in the Granite series was presented at IBM Think 2025, with more to come in the following weeks and months.

You can find Granite 4.0 Tiny on Hugging Face.

This article is based on the IBM Think news announcement authored by Kate Soule, Director, Technical Product Management, Granite, and Dave Bergmann, Senior Writer, AI Models, at IBM.
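Once the pending transformers support mentioned above lands, experimenting with the preview should look much like any other Hugging Face checkpoint. A minimal sketch follows, with the repo id assumed from IBM's naming conventions; confirm it on the model card before running.

```python
# Hypothetical quick-start, assuming the pending Hugging Face
# transformers support for the hybrid architecture has landed.
# The repo id below is illustrative; check the Granite model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-tiny-preview"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Briefly explain hybrid Mamba/transformer models.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```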
-
WWW.AIWIRE.NET
SAS Rolls Out AI Agents, Digital Twins, and Industry Models at Innovate 2025

At its SAS Innovate 2025 conference in Orlando, SAS rolled out a series of updates aimed at helping businesses make better decisions with AI. The announcements touched on a broad set of topics, from digital twins and quantum computing to new AI models and updates to its Viya AI and analytics platform. Taken together, the message was clear: SAS is focused on building AI that is usable, governed, and designed for practical outcomes.

With decades of experience in enterprise analytics, SAS is now applying that expertise to the data and decisioning layer, or how information moves through an organization and how AI can support clear outcomes. The conference's announcements show how that approach is reflected in its products, partnerships, and latest research. Here's a look at a few of the first day's announcements:

AI Agents With Built-In Guardrails

SAS is developing AI agents, or tools that can make decisions or carry out tasks based on data. These agents are designed to follow clear rules and stay within defined boundaries, rather than acting independently without limits. The goal is to help organizations use automation in a controlled and predictable way.

SAS Innovate 2025 kicked off in Orlando this week.

The agent framework combines traditional rule-based analytics with newer machine learning techniques, including large language models. This hybrid approach allows users to define specific conditions for how agents behave, including when to escalate a decision to a human. Organizations can tune this balance based on the risk, complexity, or regulatory context of each task.

Governance is built in, as the agents log decisions, apply access controls, and support audit and compliance requirements. The aim is to make AI decisioning more transparent and easier to monitor, especially in fields like finance, healthcare, and public sector work, where explainability and accountability are essential.
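SAS has not published the framework's API here, so the following is a generic sketch of the rules-first, escalate-on-risk pattern the section describes; every name and threshold is invented.

```python
from dataclasses import dataclass

# Generic escalation pattern, not SAS's API: hard rules run first,
# a model risk score runs second, and anything above a configured
# threshold is routed to a human reviewer. Every decision is logged.

@dataclass
class Decision:
    action: str      # "approve" or "escalate_to_human"
    reason: str

RISK_THRESHOLD = 0.7   # invented; tuned per task's regulatory context

def hard_rules(request: dict) -> Decision | None:
    """Deterministic guardrails that no model output can override."""
    if request["amount"] > 100_000:
        return Decision("escalate_to_human", "amount above rule-based limit")
    return None

def decide(request: dict, model_risk_score: float, audit_log: list) -> Decision:
    decision = hard_rules(request)
    if decision is None:
        if model_risk_score >= RISK_THRESHOLD:
            decision = Decision("escalate_to_human", f"risk {model_risk_score:.2f}")
        else:
            decision = Decision("approve", f"risk {model_risk_score:.2f}")
    audit_log.append((request["id"], decision))   # governance: log everything
    return decision

log: list = []
print(decide({"id": 1, "amount": 5_000}, model_risk_score=0.2, audit_log=log))
print(decide({"id": 2, "amount": 250_000}, model_risk_score=0.1, audit_log=log))
```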
Digital Twins That Run on Gaming Tech

SAS is working with Epic Games to bring the gaming company's 3D simulation technology into digital twin tools. By using Unreal Engine, a 3D engine developed by Epic Games originally for video games, SAS can create detailed virtual models of manufacturing environments, or digital twins. Companies like Georgia-Pacific are testing this setup to simulate things like equipment movement and planning, as well as factory layout changes and safety measures, before making changes in the real setting.

The digital twins combine real-time data from sensors and IoT devices with SAS analytics and interactive visuals. This allows engineers and operators to run simulations that reflect actual conditions, helping them troubleshoot issues or improve efficiency without interrupting production. While the current focus is on manufacturing, SAS says the same approach could apply to hospital operations, city planning, and other fields where physical systems are complex and hard to model.

Prebuilt AI Models for Specific Use Cases

SAS has released a set of prebuilt AI models aimed at solving common problems in sectors like health care, manufacturing, and government. The offerings include models for document analysis, identifying duplicate records across systems, and optimizing supply chains. Each model comes with documentation and can be used as-is or adapted using an organization's own data. They are designed to integrate into existing systems with minimal setup, helping teams get to deployment faster without starting from zero.

SAS is also building toward more automated systems that pair these models with AI agents. In one example, an agent can automatically prepare data for modeling by creating and managing data lakes, a task that typically requires significant manual effort. This kind of pairing is part of SAS's broader push to simplify the AI pipeline, from data ingestion to model deployment.

These announcements show where SAS is putting its attention: building practical tools that help organizations use AI more effectively and with more control. While the focus has been on practical tools and responsible AI, the day also offered some lighter moments. Monday's memorable moments included appearances from Frank Abagnale, Jr. (the real-life figure behind Catch Me If You Can) as well as a full-on '90s nostalgia hour featuring Fresh Prince alumni Alfonso Ribeiro and DJ Jazzy Jeff.

If day one is any indication, SAS Innovate is going to keep blending practical AI updates with a few curveballs along the way. AIwire will be covering more from the show, including SAS's work on quantum AI, a new tool for generating synthetic data, and updates to the SAS Viya platform. We've also had the chance to speak with several SAS executives, and we'll be sharing those insights. Stay tuned for more.
-
WWW.AIWIRE.NET
IBM Think 2025: The Mainstreaming of Gen AI and Start of Agentic AI

IBM on Monday kicked off its annual Think conference, being held in Boston this week. No surprise, generative AI and tools for enabling agentic AI will dominate discussion at the event, which is expected to attract 5,000 attendees. IBM, like most everyone, has high expectations.

"Over the next few years, we expect there will be over a billion new applications constructed using generative AI as they make their way in," said Arvind Krishna, IBM CEO, at a mid-day virtual media/analyst meeting. He was joined by Rob Thomas, chief commercial officer, and Ritika Gunnar, GM for data and AI.

Arvind Krishna, IBM CEO

Citing an IBM CEO Study, Krishna said "our clients are expecting to double or even increase the investments on AI beyond that; however, they are finding that only about 25% of the time are they getting the ROI that they expect, driven by a lot of factors: access to enterprise data, the siloed nature of the different applications, together with the fragmentation that is happening in the infrastructure."

The relatively high failure rate and poor ROI of AI projects has become a growing problem for AI advocates. At Think, IBM is touting its watsonx enterprise AI platform and rolling out new features, many designed to tame AI's deployment, execution, and ROI issues. Krishna singled out three:

watsonx Orchestrate - A tool that, Krishna said, lets you build your own agent for enterprise use in five minutes or less. It has ~80 partner agents out of the box, integrated with IBM-built agents for a total of 150.

webMethods Hybrid Integration - A new solution that IBM says replaces rigid workflows with intelligent, agent-driven automation. It will help users manage the sprawl of integrations across apps, APIs, B2B partners, events, gateways, and file transfers in hybrid cloud environments.

watsonx.data - This improved tool "allows you to bring together an open data lakehouse with data fabric capabilities like data lineage tracking and governance to help clients unify, govern, and activate data across silos, formats, and clouds," says IBM. "[It] allows enterprises to connect their AI apps and agents with their unstructured data, which can lead to 40% more accurate AI than conventional RAG."

(For a full rundown on the tools being highlighted at Think, read the IBM press release here.)

The fourth major announcement from Krishna was the introduction of a new mainframe: IBM LinuxONE 5, which IBM says is the most secure and performant Linux platform to date, with the ability to process up to 450 billion AI inference operations per day. LinuxONE 5 features highlighted by IBM include:

State-of-the-art IBM AI accelerators, including IBM's Telum II on-chip AI processor and the IBM Spyre Accelerator card (available 4Q 2025 via PCIe card), to enable generative and high-volume AI applications such as transactional workloads.

Advanced security offerings with confidential containers to help clients protect their data, and new integrations with IBM's pioneering quantum-safe encryption technology to address quantum-enabled cybersecurity attacks.

Significant reductions in costs and power consumption: moving cloud-native, containerized workloads from a compared x86 solution to an IBM LinuxONE 5 running the same software products can save up to 44% on the total cost of ownership over 5 years.

IBM clearly has a broad portfolio of enterprise-centric AI offerings. Its focus on the hybrid cloud and AI together, said Krishna, is key to taming what he called the infrastructure fragmentation going on.
In Q&A, he, Thomas, and Gunnar fielded a variety of questions.

Has IBM seen any slowdown in AI spending?

Krishna said, "The short answer is, no. We are actually seeing people double down on their AI investments. As people are looking for productivity, they're looking for cost savings, but they're also looking to scale the revenue of their own companies. AI is one of the unique technologies that can hit at the intersection of all three. The CIO surveys done by our IBV group show that everybody is doubling down on AI investments, but they're now looking for that return on AI. So the only change over the last 12 months is that people are stopping experimentation and focusing very much on where is the value to the business."

Thomas chipped in: "On ROI, I'd say this is the year that value creation has become the focus. So it's not a disillusionment with AI, as much as it is clients demanding value creation. I think we got through experimentation things like RAG, and to be blunt, some of the results were uncertain or mixed. I think the minute that it turned to automation, being able to automate your core infrastructure with things like Terraform or Vault as an example, automating your financials, moving into assistants and then agents. That's where we saw, I'd say, the tipping point for value creation."

Another questioner noted there are often more exceptions than rules in a business process. How does IBM create the level of dynamic reasoning required for AI agents to manage a workflow end to end? Krishna provided an example from IBM.

"We have a lot of experience in this. I'll give a simple example, and then we'll broaden it from there. We looked at our HR processes, and we began two years ago with the top five most common queries that were being done. We automated those [and] every couple of weeks after, we kept adding until, at this point, we have over 80 of the common actions. So those are workflows that have been fully automated.

"There are actually about 6% of what we do that we don't think there's an ROI in automating at this point. As the technology becomes better over time, it may be possible that the cost to do that comes down, and that is the approach we sort of take across the board. In many, many things that are being done, the human is never going to be out of the loop. The human is always going to be there for the most either complex or the edge cases where the AI does not have confidence in its own answer," he said.

Another question posed was: While there is a lot of enthusiasm, companies are facing challenges when it comes to adoption. How many clients have implemented so far? How many POCs, and how many POCs are translating to full-time projects? That's a good milestone question.

Krishna said, "[To] give a little bit of context on this question, I think the precursor to agents was assistants. If you look at assistants, I think at last count, we had over 20,000 deployments. Sometimes there were three or four at one client, but over 20,000 deployments. So I think it's actually going back to the comment Rob made earlier. While there are some difficulties, there is also a lot of eagerness, excitement and enthusiasm about adopting these technologies. So 20,000 was our number. If I look at proof of concepts, I think our teams do over 4,000 or 5,000 each year, in terms of kicking the tires. And about half of those, by the way, do make their way into some form of production.
"I wouldn't say all of them, but about half of them do make their way forward.

"I also think that we should understand that everybody, our partners, from SAP with Joule technology, Salesforce with their Agentforce technologies, ServiceNow with their agents on workflows, Adobe with their agents on experience, Workday with their agents on how to make the HR process more effective, Oracle with some of the agents around both HR as well as payroll, I can go on. Those are being extremely well adopted in all of these cases, including our own, across many, many clients, because they help. Now, these are not general-purpose agents. Almost all the examples I mentioned do a small but very precise set of tasks extremely well."

Gunnar added, "We have an IBV study that we just released today, or will be released tomorrow, that the majority of enterprises are doing proof of concepts or are leveraging AI agent technology, whether it be from their single application vendors, like in the examples that Arvind mentioned, or whether it be agents that they build on their own. So we know that for a majority of organizations, 2025 is the year that we see a lot of organizations experimenting with AI agents. Now we also see some of those in production. And when we see those in production, they're usually simple, task-based execution AI agents.

"We're seeing a lot of research and technology maturing in terms of the underlying reasoning capabilities of these models and in terms of what it means to have full-stack observability and traceability. We're now starting to see, even in the AI agent technologies, more complex use cases such as automations, collaboration, orchestration. And so you can see that as a spectrum where many of them are experimenting; simple task-based AI agents are being implemented in production. And as the technology continues to mature, we're going to see even the more complex cases of orchestration, collaboration and automation continue to mature into production, whereby in the next couple years, we believe that over 50% of organizations will have these AI agents embedded in their essential systems."

Thomas added, "To give you a specific example, about a year and a half ago we partnered with Dun & Bradstreet to build an assistant called Ask Procurement. This was using procurement data and company data and shipping data from Dun & Bradstreet. We constructed this application, which leverages Orchestrate and watsonx, and it gives a procurement agent the ability to ask questions and to figure out the best place to source something. We've now started working on how do we make that agentic? Because today it's really just an AI Q&A, and making it agentic would be actually linking it to, if I'm choosing a supplier, what can I get in terms of actual delivery times, as I look at shipping schedules, shipping lanes, whatever it may be. So I would say the technology is there. This is about the change management and then applying it to these specific problems. But for most companies, I think often starting with an assistant and then moving towards agentic workflows is sometimes the fastest, lowest-risk way to do it."

Link to IBM press announcement: https://www.hpcwire.com/off-the-wire/ibm-accelerates-enterprise-gen-ai-revolution-with-hybrid-capabilities.

Link to THINK: https://www.ibm.com/events/think.

This article first appeared on HPCwire.
-
WWW.AIWIRE.NET
QED-C Workshop Identifies Quantum AI Targets

How, exactly, will quantum computing and AI work together? Following the flood of marketing enthusiasm for Quantum AI during the past couple of years, the Quantum Economic Development Consortium (QED-C) has released a report (Quantum Computing and Artificial Intelligence Use Cases) based on a QED-C workshop held last October.

While not especially granular, the QED-C report provides a good overview for understanding Quantum AI, or how the two disciplines may work together. For example, there are sections on Quantum AI applications in chemistry and materials science modeling; optimization in logistics and energy; weather modeling and environmental science; and signal processing and quantum sensing.

Broadly, the report was prepared by members of the Quantum + AI use case technical committee: Carl Dukatz, Accenture; Pau Farr, D-Wave; Kevin Glynn, Northwestern University; Kenny Heitritter, qBraid; Tom Lubinski, Quantum Circuits Inc.; Rima Oueid, Department of Energy; Travis Scholten, IBM; Allison Schwartz, D-Wave; Keeper Sharkey, ODE, L3C; and Davide Venturelli, USRA.

The report focuses on four topics:

Novel solutions or applications that could emerge from the synergy of QC and AI that are currently not feasible with classical computing approaches.

Approaches for which AI could be used to identify use cases for QC.

Opportunities to use AI technologies to accelerate the development of specific QC technologies or the quantum ecosystem at large.

The technical advances needed for QC + AI integration in possible areas of their joint application.

Here's an excerpt: "Though independent technologies, QC and AI can complement each other in many significant and multidirectional ways. For example, AI could assist QC by accelerating the development of circuit design, applications, and error correction and generating test data for algorithm development. QC can solve certain types of problems more efficiently, such as optimization and probabilistic tasks, potentially enhancing the ability of AI models to analyze complex patterns or perform computations that are infeasible for classical systems. A hybrid approach integrating the strengths of classical AI methods with the potential of QC algorithms leverages the two technologies to substantially reduce algorithmic complexity, improving the efficiency of computational processes and resource allocation."

QED-C Executive Director Celia Merzbacher said, "Simultaneous advances in quantum computing and AI offer advantages for both fields, individually and collectively. QED-C looked at the potential in the context of practical applications and use cases. At this early stage, industry, academia and governments must collaborate to make the most of this opportunity."

The QED-C report makes three overarching recommendations:

Include support for QC + AI in federal quantum and AI initiatives;

Increase QC + AI research and education in academia;

Connect industries to accelerate QC + AI technology development, demonstration, and adoption.

Translating these calls to action may be difficult in the current environment of science budget slashing. (The full text of the recommendations is included at the end of the article.)

One of the more active areas highlighted by the report is using AI to accelerate the development of quantum technology itself. Indeed, there's widespread consensus that this use case represents low-hanging fruit.
The report singles out the following areas in which AI could help accelerate quantum development:

AI could assist QC software and algorithm developers and optimize QC hardware design, including qubits and quantum circuits. Some microchip designers already use AI to develop advanced semiconductors, suggesting a natural extension of AI's role into QC hardware. AI could also help enhance the design of hardware components for quantum networks.

AI could help design and refine QC algorithms to improve their efficiency and performance.

Software developers can leverage code assistants trained on QC software development kits to both accelerate code development and increase the number of developers capable of programming such computers.

AI could address critical QC challenges, such as error correction (e.g., by dynamic optimization of error correction codes based on real-time noise profiles) and noise reduction (e.g., by analysis of noise patterns).

Overall, the report could have been strengthened with discussion of a few specific case histories, given that QED-C's membership is likely to have such practical experience, if only in POC efforts. It's likely the desire was not to spotlight any particular company's efforts. The full report was first released just to QED-C members in March, but this week it was made available to the broader public.

Link to report: https://quantumconsortium.org/quantum-computing-and-artificial-intelligence-use-cases.

QED-C Quantum + AI Report Recommendations

1. Include support for QC + AI in federal quantum and AI initiatives:

The federal government invests in a substantial and broad portfolio of quantum technology R&D, guided by the National Quantum Initiative (NQI) Act, CHIPS and Science Act, and other legislation. Federal agencies should explicitly include support for R&D for QC + AI hybrid technologies, including for heterogeneous computing environments that comprise multiple computing paradigms, such as quantum processing units, central processing units, graphical processing units, neuromorphic computing, et al.

Federal support for QC + AI R&D should also foster infrastructure and programs that bring experts together to share knowledge and learning. For example, heterogeneous computing testbeds at national labs that are open to the broad research community could support cross-sector applied research aimed at practical application. In fact, the NQI established several national quantum centers, many of which include testbeds, and these should be expanded to explore QC + AI technologies. Specific support is needed for testbeds that facilitate integration of QC with other technologies.

Non-quantum testbeds could also be encouraged to explore potential integration of QC technologies. For example, federally funded testbeds for grid resilience and advanced manufacturing could explore how QC + AI could benefit those fields. The NSF's National AI Research Institutes could include a focus on using AI to develop new QC algorithms, which could in turn advance both QC and AI. Cross-sector collaboration and integration of different technologies are critical for staying at the forefront of QC R&D and increasing opportunities for QC + AI technology deployment.

Finally, the Quantum User Expansion for Science and Technology (QUEST) program authorized by the CHIPS and Science Act provides researchers from academia and the private sector access to commercial quantum computers. QUEST could include support for research specifically on QC + AI.
2. Increase QC + AI research and education in academia:

AI is currently a trendy field, attracting many community college and university students to software and computer science degrees. At the same time, QC is attracting interest among physical science and engineering students. This large pool can be leveraged to advance QC + AI technologies. For example, higher education institutions can introduce more students to both fields by offering interdisciplinary courses involving physics, math, engineering and computer science. To better prepare students for careers in industry and to build AI capacity at QC companies, universities could partner with QC companies to provide internships and hands-on training. Such a program exists in Canada and would be a worthwhile addition to US efforts.

Government funding agencies such as NSF, DOE, and DARPA could also encourage multidisciplinary QC + AI research by creating programs that fund teams of QC and AI researchers to collaborate. For example, multidisciplinary teams could research classical algorithms to drive efficiencies in real-world quantum use cases or large-scale methods for error correction. The Materials Genome Project, which funded experimental, theoretical, and computational research by multidisciplinary teams, is an example of such an approach. Agencies might need to create mechanisms to bridge program offices to ensure multidisciplinary program funding and management.

3. Connect industries to accelerate QC + AI technology development, demonstration, and adoption:

While AI is being adopted by seemingly every industry, QC + AI is still relatively early-stage, and awareness among end users is low. Better engagement and interaction among the developers of QC and AI and with end users is needed to enable creation of new capabilities, products, and services that provide real business value. QC and AI industry consortia, such as QED-C and the AI Alliance, should join forces to raise awareness among their members, create opportunities for collaboration, and identify gaps that government funding could help to fill. Together these groups can also engage end-user communities to identify sectors that could be early adopters and partners to drive initial applications.

Early applications will feed into additional and broader use cases, eventually reaching an inflection similar to that experienced by AI, after which QC + AI uses will grow exponentially. Hackathons and business-focused QC + AI challenges could push knowledge sharing and spur interest.

Within government, there are opportunities to promote QC + AI development to achieve the goals of programs aimed at industries from advanced manufacturing to microelectronics. For example, Manufacturing USA funds 18 advanced manufacturing institutes that aim to develop diverse manufacturing capabilities. QC + AI has the potential to disrupt and allow for manufacturing innovation and could be infused into many of the institutes' R&D programs. Similarly, the CHIPS R&D program seeks to develop capabilities for future chip technologies. In the 5-10 year timeframe, QC + AI will be poised to impact the traditional semiconductor-based computing ecosystem. The CHIPS R&D program needs to include QC + AI research to ensure this emerging technology is seamlessly incorporated into future microelectronics technologies.

This article first appeared on HPCwire.
-
WWW.AIWIRE.NET
Three Ways AI Can Weaken Your Cybersecurity

Even before generative AI arrived on the scene, companies struggled to adequately secure their data, applications, and networks. In the never-ending cat-and-mouse game between the good guys and the bad guys, the bad guys win their share of battles. However, the arrival of GenAI brings new cybersecurity threats, and adapting to them is the only hope for survival.

There's a wide variety of ways that AI and machine learning interact with cybersecurity, some of them good and some of them bad. But in terms of what's new to the game, three patterns stand out and deserve particular attention: slopsquatting, prompt injection, and data poisoning.

Slopsquatting

Slopsquatting is a fresh AI take on typosquatting, where ne'er-do-wells spread malware to unsuspecting Web travelers who happen to mistype a URL. With slopsquatting, the bad guys spread malware through software development libraries that have been hallucinated by GenAI.

Slopsquatting is a new way to compromise AI systems. (Source: flightofdeath/Shutterstock)

We know that large language models (LLMs) are prone to hallucinations. The tendency to create things out of whole cloth is not so much a bug of LLMs as a feature that's intrinsic to the way LLMs are developed. Some of these confabulations are humorous, but others can be serious. Slopsquatting falls into the latter category.

Large companies have reportedly recommended Python libraries that have been hallucinated by GenAI. In a recent story in The Register, Bar Lanyado, security researcher at Lasso Security, explained that Alibaba recommended users install a fake version of the legitimate library called huggingface-cli.

While it is still unclear whether the bad guys have weaponized slopsquatting yet, GenAI's tendency to hallucinate software libraries is perfectly clear. Last month, researchers published a paper concluding that GenAI recommends Python and JavaScript libraries that don't exist about one-fifth of the time.

"Our findings reveal that the average percentage of hallucinated packages is at least 5.2% for commercial models and 21.7% for open-source models, including a staggering 205,474 unique examples of hallucinated package names, further underscoring the severity and pervasiveness of this threat," the researchers wrote in the paper, titled "We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs."

Of the 205,000+ instances of package hallucination, the names appeared to be inspired by real packages 38% of the time, were the result of typos 13% of the time, and were completely fabricated 51% of the time.
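One inexpensive defense against slopsquatting is to verify that an LLM-suggested package actually exists on the registry before installing it. A minimal sketch using PyPI's public JSON API follows; note that existence alone is not proof of safety, since attackers can pre-register hallucinated names.

```python
import json
import urllib.request
from urllib.error import HTTPError

# Minimal slopsquatting guard: before pip-installing a package an LLM
# suggested, check that it exists on PyPI. Existence is necessary but
# not sufficient: attackers can pre-register hallucinated names, so
# also review the release history and maintainers on pypi.org.

def pypi_metadata(package: str) -> dict | None:
    """Return PyPI metadata for `package`, or None if it doesn't exist."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except HTTPError as err:
        if err.code == 404:
            return None     # package name was likely hallucinated
        raise

for name in ("requests", "definitely-not-a-real-package-xyz"):
    meta = pypi_metadata(name)
    status = "exists" if meta else "NOT FOUND - do not install blindly"
    print(f"{name}: {status}")
```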
Prompt Injection

Just when you thought it was safe to venture onto the Web, a new threat has emerged: prompt injection. Like the SQL injection attacks that plagued early Web 2.0 warriors who didn't adequately validate database input fields, prompt injections involve the surreptitious injection of a malicious prompt into a GenAI-enabled application to achieve some goal, ranging from information disclosure to code execution rights.

A list of AI security threats from OWASP. (Source: Ben Lorica)

Mitigating these sorts of attacks is difficult because of the nature of GenAI applications. Instead of inspecting code for malicious entities, organizations must investigate the entirety of a model, including all of its weights. That's not feasible in most situations, forcing them to adopt other techniques, says data scientist Ben Lorica.

"A poisoned checkpoint or a hallucinated/compromised Python package named in an LLM-generated requirements file can give an attacker code-execution rights inside your pipeline," Lorica writes in a recent installment of his Gradient Flow newsletter. "Standard security scanners can't parse multi-gigabyte weight files, so additional safeguards are essential: digitally sign model weights, maintain a bill of materials for training data, and keep verifiable training logs."

A twist on the prompt injection attack was recently described by researchers at HiddenLayer, who call their technique policy puppetry. "By reformulating prompts to look like one of a few types of policy files, such as XML, INI, or JSON, an LLM can be tricked into subverting alignments or instructions," the researchers write in a summary of their findings. "As a result, attackers can easily bypass system prompts and any safety alignments trained into the models."

The company says its approach to spoofing policy prompts enables it to bypass model alignment and produce outputs that are in clear violation of AI safety policies, including CBRN (Chemical, Biological, Radiological, and Nuclear), mass violence, self-harm, and system prompt leakage.
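Lorica's advice to digitally sign model weights can be approximated even without full signing infrastructure by pinning a known-good digest for each vetted checkpoint and refusing to load anything that does not match. A minimal sketch using SHA-256 pinning follows; the digest below is a placeholder, and production setups would use real key-based signatures.

```python
import hashlib
from pathlib import Path

# Checksum pinning for model checkpoints: a lightweight stand-in for
# digitally signed weights. The pinned digest is a placeholder; record
# the real one when you first vet a checkpoint.

PINNED_SHA256 = {
    "model.safetensors": "0" * 64,   # placeholder digest
}

def sha256_of(path: Path) -> str:
    """Stream the file so multi-gigabyte weight files don't fill RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_checkpoint(path: Path) -> None:
    expected = PINNED_SHA256.get(path.name)
    if expected is None:
        raise RuntimeError(f"{path.name}: no pinned digest; refusing to load")
    if sha256_of(path) != expected:
        raise RuntimeError(f"{path.name}: digest mismatch; possible tampering")
    print(f"{path.name}: digest OK")

# verify_checkpoint(Path("model.safetensors"))  # run before loading weights
```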
Data Poisoning

Data lies at the heart of machine learning and AI models. So if a malicious user can inject, delete, or change the data that an organization uses to train an ML or AI model, then he or she can potentially skew the learning process and force the model to generate an adverse result.

Symptoms and remediations of data poisoning. (Source: CrowdStrike)

A form of adversarial AI attack, data poisoning (or data manipulation) poses a serious risk to organizations that rely on AI. According to the security firm CrowdStrike, data poisoning is a risk to healthcare, finance, automotive, and HR use cases, and can even potentially be used to create backdoors.

"Because most AI models are constantly evolving, it can be difficult to detect when the dataset has been compromised," the company says in a 2024 blog post. "Adversaries often make subtle but potent changes to the data that can go undetected. This is especially true if the adversary is an insider and therefore has in-depth information about the organization's security measures and tools as well as their processes."

Data poisoning can be either targeted or non-targeted. In either case, there are telltale signs that security professionals can look for that indicate whether their data has been compromised.

AI Attacks as Social Engineering

These three AI attack vectors, slopsquatting, prompt injection, and data poisoning, aren't the only ways that cybercriminals can attack organizations via AI. But they are three avenues that AI-using organizations should be aware of to thwart the potential compromise of their systems. Unless organizations take pains to adapt to the new ways that hackers can compromise systems through AI, they run the risk of becoming a victim. Because LLMs behave probabilistically instead of deterministically, they are much more liable to social engineering types of attacks than traditional systems, Lorica says.

"The result is a dangerous security asymmetry: exploit techniques spread rapidly through open-source repositories and Discord channels, while effective mitigations demand architectural overhauls, sophisticated testing protocols, and comprehensive staff retraining," Lorica writes. "The longer we treat LLMs as just another API, the wider that gap becomes."

This article first appeared on BigDATAwire.
-
WWW.AIWIRE.NET
Parallel Works Unveils ACTIVATE High Security Platform, Secures Key DoD Accreditation

Parallel Works has just launched its new ACTIVATE High Security Platform, a hybrid multi-cloud computing control plane designed to meet the Department of Defense's (DoD) stringent security needs. The platform, which has achieved Impact Level 5 (IL5) Provisional Authorization from the Defense Information Systems Agency (DISA), offers DoD agencies and contractors a faster, more affordable path to securely running critical HPC and AI workloads across public and private cloud environments.

The launch positions Parallel Works among a small group of providers authorized to handle Controlled Unclassified Information (CUI) and International Traffic in Arms Regulations (ITAR) data, a category that includes tech giants such as AWS, Google, and Oracle. It also marks a rare entry for a specialized control plane vendor focused specifically on hybrid HPC and AI workloads.

ACTIVATE HSP addresses a major bottleneck in the defense sector: the slow and costly security accreditation process that has prevented wider adoption of cloud computing. By providing a pre-accredited IL5 environment, the new platform allows agencies and contractors to bypass months of security work and begin running sensitive workloads with minimal setup.

Meeting an Overdue Need in Defense Cloud Adoption

Defense organizations are turning to cloud computing to meet modernization goals such as scaling compute capacity quickly, accelerating workload deployment, and gaining faster access to high-performance AI and simulation tools. These priorities are central to initiatives like the Joint Warfighter Cloud Capability (JWCC) program. However, actually using cloud services for classified or sensitive projects has remained difficult.

(Source: Parallel Works)

In an interview, Parallel Works CEO Matthew Shaxted told AIwire that major cloud providers like AWS, Azure, Google, and Oracle would probably agree that the DoD, and the defense industrial base in general, isn't leveraging cloud to its full potential. "The accreditation barrier is impeding wide adoption. It forms a valley of death for using cloud in the DoD."

That challenge remains even after initial cloud accounts are set up. Even when cloud providers like AWS or Azure receive blanket security authorizations, end users still face a steep climb out of this "valley of death." Setting up an account is relatively easy, but preparing it for real work requires configuring it to meet IL5 or similar standards, passing audits, and obtaining an Authority to Operate (ATO), which typically requires eighteen months or more, plus significant financial and labor investments.

Parallel Works designed ACTIVATE HSP specifically to eliminate these hurdles. Customers can operate within a fully accredited IL5 environment without taking on the overhead of building and certifying their own. The platform provides immediate access to IL5-compliant cloud infrastructure while allowing organizations to retain total control over their data, workloads, and systems.

What ACTIVATE HSP Does

Founded in 2015 and spun out of Argonne National Laboratory, Parallel Works developed its original ACTIVATE platform to simplify the management of complex hybrid computing environments.
Initially designed to streamline batch scheduler systems, ACTIVATE has evolved into a unified control plane that connects users to many different cloud and on-premises HPC and AI resources.

(Source: Parallel Works)

ACTIVATE HSP builds on this foundation, offering a high-security version that supports deployment across AWS GovCloud, with support for Azure Government and Google Cloud Platform expected in late summer. Users can burst workloads across multiple cloud providers while maintaining strict IL5 compliance. The platform also links to Defense Supercomputing Resource Centers (DSRCs), so users can combine local HPC systems with cloud-based resources.

Parallel Works also built real-time cost controls into ACTIVATE HSP. Instead of waiting for delayed billing reports from cloud providers, the platform monitors spending every three minutes and can enforce budget limits immediately. This helps prevent the budget overruns that have historically plagued government cloud projects due to billing-cycle delays.
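As a generic illustration of that poll-and-enforce pattern, and emphatically not the ACTIVATE HSP API, a control plane can sample spend on a short interval and act the moment a limit is crossed rather than waiting for the billing cycle. All names and numbers here are invented, and the spend feed is simulated.

```python
import itertools

# Generic poll-and-enforce budget loop. In a real control plane,
# fetch_current_spend() would query the provider's cost API and
# suspend_resources() would stop running instances.

POLL_SECONDS = 180          # sample spend every three minutes
BUDGET_LIMIT_USD = 10_000   # invented project budget

simulated_spend = itertools.count(start=9_000, step=400)  # stand-in data

def fetch_current_spend() -> float:
    return float(next(simulated_spend))

def suspend_resources() -> None:
    print("suspending compute resources")

def enforce_budget(max_polls: int = 10) -> None:
    for _ in range(max_polls):
        spend = fetch_current_spend()
        if spend >= BUDGET_LIMIT_USD:
            suspend_resources()      # act immediately, not at month end
            print(f"budget hit at ${spend:,.2f}")
            return
        # in production: time.sleep(POLL_SECONDS)

enforce_budget()
```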
In the interview, Shaxted explained that organizations have two options for using ACTIVATE HSP. Some, like the DoD's High Performance Computing Modernization Program (HPCMP), are offering it as a shared service for other agencies. Others can inherit the accredited package to set up their own dedicated environments, cutting the time needed to achieve operational approval.

Why IL5 Accreditation Matters

Impact Level 5 is the DoD's standard for systems that manage sensitive national security information but remain connected to the public internet. Achieving IL5 compliance requires meeting a long list of security controls, drawn from FedRAMP High standards with additional DoD-specific requirements. Parallel Works spent nearly three years on the accreditation process. The effort involved satisfying more than 400 security controls, compiling a documentation package exceeding 2,000 pages, and undergoing third-party and DISA audits.

IL5 authorization allows users to store, process, and transmit CUI and ITAR-controlled data without requiring isolated, air-gapped environments. This distinction is critical because it enables more flexible computing architectures while still satisfying mission security requirements. By inheriting a fully audited environment, users can avoid the time and expense typically required to secure their own ATO. In many cases, projects that would have needed 18 months of security work can begin operating in a matter of days or weeks.

What ACTIVATE HSP Means for DoD Project Teams

The benefits of the ACTIVATE High Security Platform are practical: reduced costs, flexible access to secure compute, and deployment timelines measured in days, not months. Take, for example, a DoD project team preparing to run AI-enabled image analysis workflows on satellite data. Without a pre-authorized environment, the team would face months of system hardening, documentation, security reviews, and risk management evaluations, often before uploading a single dataset. These delays can stall mission timelines and slow the delivery of critical insights to decision-makers.

With ACTIVATE HSP, the same team could inherit a compliant environment, configure resources through a familiar interface, and begin training models within days. They can also scale up as needed across cloud and on-premises infrastructure, without being limited by local capacity or held back by compliance barriers.

The cost difference is just as stark. Building an IL5-compliant system independently can run into the millions, factoring in consulting fees, security tooling, and internal labor. By inheriting Parallel Works' existing accreditation package, users reduce that overhead to the baseline costs of platform access and cloud consumption.

Expanding Access to Secure Cloud Infrastructure

The launch of ACTIVATE HSP comes at a time when the DoD is facing growing pressure to modernize its digital infrastructure. As AI, digital twins, and autonomous systems become more central to defense operations, the demand for computing environments that combine performance with strong security continues to rise.

Parallel Works CEO Matthew Shaxted

In the near term, Parallel Works is focused on onboarding new users through shared environments such as the HPCMP's implementation. Support for Azure Government and Google Cloud is expected later this summer, giving defense teams the ability to choose the cloud providers that best meet their needs.

One may wonder: how did a small company end up building one of the only IL5-authorized control planes available to the DoD? The origin of ACTIVATE HSP traces back to a competitive solicitation issued by the Defense Innovation Unit (DIU) three years ago. DIU, which helps adapt commercial technology for military use, had called for hybrid computing solutions that could address emerging security and scalability demands. Parallel Works' original ACTIVATE platform aligned with the technical vision but lacked the high-security capabilities needed for sensitive workloads. Over the next three years, the company worked closely with defense stakeholders to develop, test, and certify what became ACTIVATE HSP. "They really saw the need first," Shaxted said. "We spent the last three years making it real."

Parallel Works executives will present the platform next week at Special Operations Forces (SOF) Week in Tampa, Florida, May 5-8, 2025, where they will meet with prospective users from across the defense and national security ecosystem. More information about the platform is available at this link.
-
WWW.AIWIRE.NET
It's Time to Get Comfortable with Uncertainty in AI Model Training

It's obvious when a dog has been poorly trained. It doesn't respond properly to commands, pushes boundaries, and behaves unpredictably. The same is true of a poorly trained artificial intelligence (AI) model. Only with AI, it's not always easy to identify what went wrong with the training.

Research scientists globally are working with a variety of AI models trained on experimental and theoretical data. The goal is to predict a material's properties before creating and testing it. They are using AI to design better medicines and industrial chemicals in a fraction of the time it takes for experimental trial and error. But how can they trust the answers that AI models provide? It's not just an academic question. Millions of investment dollars can ride on whether AI model predictions are reliable.

A research team from the Department of Energy's Pacific Northwest National Laboratory has developed a method to determine how well a class of AI models called neural network potentials has been trained. Further, it can identify when a prediction is outside the boundaries of its training and where it needs more training to improve, a process called active learning.

A dog that has been poorly trained is like an AI model that has been poorly trained. It doesn't know its boundaries. (Source: Jaromir Chalabala/Shutterstock) [Editor's Note: But he's a good boy.]

The research team, led by PNNL data scientists Jenna Bilbrey Pope and Sutanay Choudhury, describes how the new uncertainty quantification method works in a research article published in npj Computational Materials. The team is also making the method publicly available on GitHub as part of its larger repository, Scalable Neural Network Atomic Potentials (SNAP), to anyone who wants to apply it to their own work.

"We noticed that some uncertainty models tend to be overconfident, even when the actual error in prediction is high," said Bilbrey Pope. "This is common for most deep neural networks. However, a model trained with SNAP gives a metric that mitigates this overconfidence. Ideally, you'd want to look at both prediction uncertainty and training data uncertainty to assess your overall model performance."

Instilling trust in AI model training to speed discovery

Research scientists want to take advantage of AI's speed of predictions, but right now, there's a tradeoff between speed and accuracy. An AI model can make predictions in seconds that might take a supercomputer 12 hours to compute using traditional computationally intensive methods. However, chemists and materials scientists still see AI as a black box. The PNNL data science team's uncertainty measurement provides a way to understand how much they should trust an AI prediction.

"AI should be able to accurately detect its knowledge boundaries," said Choudhury. "We want our AI models to come with a confidence guarantee. We want to be able to make statements such as 'This prediction provides 85% confidence that catalyst A is better than catalyst B, based on your requirements.'"
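The paper and the SNAP repository describe the team's actual metric; as a generic illustration of uncertainty-aware prediction, not SNAP's method, an ensemble of models trained on resampled data can flag inputs where its members disagree, a common proxy for being outside the training boundary.

```python
import numpy as np

# Generic ensemble-disagreement sketch, not SNAP's actual metric:
# fit several models on resampled data, and treat the spread of their
# predictions as an uncertainty signal. Inputs far from the training
# data tend to produce high disagreement.

rng = np.random.default_rng(1)
x_train = rng.uniform(-2, 2, size=200)
y_train = np.sin(x_train) + rng.normal(scale=0.05, size=200)

# "Ensemble" of small polynomial fits on bootstrap resamples.
ensemble = []
for _ in range(10):
    idx = rng.integers(0, len(x_train), len(x_train))
    ensemble.append(np.polyfit(x_train[idx], y_train[idx], deg=7))

def predict_with_uncertainty(x: float) -> tuple[float, float]:
    preds = np.array([np.polyval(coeffs, x) for coeffs in ensemble])
    return preds.mean(), preds.std()   # std is the uncertainty proxy

for x in (0.5, 4.0):                   # in-domain vs. far outside training
    mean, std = predict_with_uncertainty(x)
    print(f"x={x}: prediction {mean:+.2f}, uncertainty {std:.3f}")
```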
In their published study, the researchers chose to benchmark their uncertainty method against one of the most advanced foundation models for atomistic materials chemistry, called MACE. The researchers calculated how well the model is trained to calculate the energy of specific families of materials. These calculations are important to understanding how well the AI model can approximate the more time- and energy-intensive methods that run on supercomputers. The results show what kinds of simulations can be calculated with confidence that the answers are accurate.

This kind of trust and confidence in predictions is crucial to realizing the potential of incorporating AI workflows into everyday laboratory work and the creation of autonomous laboratories where AI becomes a trusted lab assistant, the researchers added.

"We have worked to make it possible to wrap any neural network potential for chemistry into our framework," said Choudhury. "Then, in a SNAP, they suddenly have the power of being uncertainty aware."

Now, if only puppies could be trained in a snap.

In addition to Bilbrey Pope and Choudhury, PNNL data scientists Jesun S. Firoz and Mal-Soon Lee contributed to the study. This work was supported by the Transferring exascale computational chemistry to cloud computing environment and emerging hardware technologies (TEC4) project, which is funded by the DOE Office of Science, Office of Basic Energy Sciences.

About PNNL

Pacific Northwest National Laboratory draws on its distinguishing strengths in chemistry, Earth sciences, biology and data science to advance scientific knowledge and address challenges in energy resiliency and national security. Founded in 1965, PNNL is operated by Battelle and supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the DOE Office of Science website. For more information on PNNL, visit PNNL's News Center. Follow us on Twitter, Facebook, LinkedIn and Instagram.

Note: This article was initially posted on the PNNL News Site and is reproduced here with permission. Karyn Hede is a Senior Science Communicator and Media Relations Advisor at PNNL.