FreshRSS


Gate Drive Measurement Considerations

One of the primary purposes of a gate driver is to enable power switches to turn on and off faster, improving rise and fall times. Faster switching enables higher efficiency and higher power density by reducing switching losses in the power stage. However, as slew rates increase, so does measurement and characterization uncertainty.

Effective measurement and characterization must account for:

  • Proper gate driver design
    – Accurate timing (propagation delay with regard to skew, pulse-width distortion, and jitter)
    – Controllable gate rise and fall times
    – Robustness against noise sources (input glitches and CMTI)
  • Minimized noise coupling
  • Minimized parasitic inductance

The trend toward wide bandgap power designs over silicon based power designs makes measurement and characterization a greater challenge. High slew rates in SiC and GaN devices present designers with hazards such as large overshoots and ringing, and potentially large unwanted voltage transients that can cause spurious switching of the MOSFETs.
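The measurement side of that problem can be made concrete with a short sketch. The following is a minimal illustration, assuming a rising-edge record has already been captured; the array names, the 10–90 percent convention, and the synthetic test waveform are assumptions, not tied to any particular instrument.

```python
import numpy as np

def rise_metrics(t, v, v_final):
    """Estimate 10-90 % rise time, average slew rate, and overshoot
    from a sampled rising-edge gate waveform (t in seconds, v in volts)."""
    t10 = t[np.argmax(v >= 0.10 * v_final)]      # first sample crossing 10 %
    t90 = t[np.argmax(v >= 0.90 * v_final)]      # first sample crossing 90 %
    rise_time = t90 - t10
    slew_rate = 0.80 * v_final / rise_time       # average dV/dt over the 10-90 % span
    overshoot_pct = 100.0 * (v.max() - v_final) / v_final
    return rise_time, slew_rate, overshoot_pct

# Synthetic 12 V gate edge with ringing, sampled at 1 ns (illustration only)
t = np.arange(0.0, 200e-9, 1e-9)
v = 12.0 * (1.0 - np.exp(-t / 20e-9) * np.cos(2 * np.pi * 50e6 * t))
print(rise_metrics(t, v, v_final=12.0))
```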


Researchers Can Make AI Forget You

By Matthew Hutson

Whether you know it or not, you’re feeding artificial intelligence algorithms. Companies, governments, and universities around the world train machine learning software on unsuspecting citizens’ medical records, shopping history, and social media use. Sometimes the goal is to draw scientific insights, and other times it’s to keep tabs on suspicious individuals. Even AI models that abstract from data to draw conclusions about people in general can be prodded in such a way that individual records fed into them can be reconstructed. Anonymity dissolves.

To restore some amount of privacy, recent legislation such as Europe’s General Data Protection Regulation and the California Consumer Privacy Act provides a right to be forgotten. But making a trained AI model forget you often requires retraining it from scratch with all the data but yours. This process can take weeks of computation.

Two new papers offer ways to delete records from AI models more efficiently, possibly saving megawatts of energy and making compliance more attractive. “It seemed like we needed some new algorithms to make it easy for companies to actually cooperate, so they wouldn’t have an excuse to not follow these rules,” said Melody Guan, a computer scientist at Stanford and co-author of the first paper.

Because not much has been written about efficient data deletion, the Stanford authors first aimed to define the problem and describe four design principles that would help ameliorate it. The first principle is “linearity”: Simple AI models that just add and multiply numbers, avoiding so-called nonlinear mathematical functions, are easier to partially unravel. The second is “laziness,” in which heavy computation is delayed until predictions need to be made. The third is “modularity”: If possible, train a model in separable chunks and then combine the results. The fourth is “quantization,” or making averages lock onto nearby discrete values so removing one contributing number is unlikely to shift the average.

The Stanford researchers applied two of these principles to a type of machine learning algorithm called k-means clustering, which sorts data points into natural clusters—useful for, say, analyzing genetic differences between closely related populations. (Clustering has been used for this exact task on a medical database called the UK Biobank, and one of the authors has actually received a notice that some patients had asked for their records to be removed from that database.) Using quantization, the researchers developed an algorithm called Q-k-means and tested it on six datasets, categorizing cell types, written digits, hand gestures, forest cover, and hacked Internet-connected devices. Deleting 1,000 data points from each set, one point at a time, Q-k-means was 2 to 584 times as fast as regular k-means, with almost no loss of accuracy.
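The quantization idea can be sketched in a few lines. This is a minimal illustration of the principle rather than the authors’ Q-k-means algorithm; the grid spacing and helper names are assumptions.

```python
import numpy as np

def quantize(c, step=0.1):
    """Snap a centroid onto a grid with spacing `step`."""
    return np.round(c / step) * step

def quantized_kmeans(X, k, iters=20, step=0.1, seed=0):
    """Plain Lloyd iterations, but centroids are quantized after every update
    (assumes no cluster goes empty)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        centroids = np.stack([quantize(X[labels == j].mean(axis=0), step)
                              for j in range(k)])
    return centroids, labels

def delete_point(X, centroids, labels, idx, step=0.1):
    """Remove point `idx`. If its cluster's quantized centroid does not move,
    nothing else in the model needs to be recomputed."""
    j = labels[idx]
    members = np.setdiff1d(np.where(labels == j)[0], [idx])
    new_c = quantize(X[members].mean(axis=0), step)
    unchanged = np.allclose(new_c, centroids[j])
    updated = centroids.copy()
    updated[j] = new_c
    return updated, unchanged
```

The early-exit check mirrors the quantization principle described above: removing one contributing point usually leaves the snapped centroid, and therefore the rest of the clustering, unchanged.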

Using modularization, they developed DC-k-means (for Divide and Conquer). The points in a dataset are randomly split into subsets, and clustering is done independently within each subset. Then the resulting cluster centers are themselves clustered, and so on. Deleting a point from one subset leaves the others untouched. Here the speedup ranged from 16 to 71 times, again with almost no loss of accuracy. The research was presented last month at the Neural Information Processing Systems (NeurIPS) conference, in Vancouver, Canada.
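A rough sketch of that divide-and-conquer structure, using scikit-learn’s KMeans as the base clusterer. The two-level layout and helper names are a simplification for illustration, not the authors’ implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def dc_kmeans(X, k, n_subsets=4, seed=0):
    """Cluster random subsets independently, then cluster the per-subset centers."""
    rng = np.random.default_rng(seed)
    parts = list(np.array_split(rng.permutation(len(X)), n_subsets))
    sub_models = [KMeans(n_clusters=k, n_init=10).fit(X[idx]) for idx in parts]
    all_centers = np.vstack([m.cluster_centers_ for m in sub_models])
    top = KMeans(n_clusters=k, n_init=10).fit(all_centers)   # second level
    return parts, sub_models, top

def delete_point(X, parts, sub_models, point_id, k):
    """Only the subset containing the deleted point is re-clustered."""
    for s, idx in enumerate(parts):
        if point_id in idx:
            parts[s] = idx[idx != point_id]
            sub_models[s] = KMeans(n_clusters=k, n_init=10).fit(X[parts[s]])
            break
    all_centers = np.vstack([m.cluster_centers_ for m in sub_models])
    return KMeans(n_clusters=k, n_init=10).fit(all_centers)  # cheap top-level redo
```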

“What’s nice about the paper is they were able to leverage some of the underlying aspects of this algorithm”—k-means clustering—said Nicolas Papernot, a computer scientist at the University of Toronto and Vector Institute, who was not involved in the work. But some of the tricks won’t work as well with other types of algorithms, such as the artificial neural networks used in deep learning. Last month, Papernot and collaborators posted a paper on the preprint server arXiv presenting a training approach that can be used with neural networks, called SISA training (for Sharded, Isolated, Sliced, and Aggregated).

The approach uses modularity in two different ways. First, sharding breaks the dataset into subsets, and copies of the model are trained independently on each. When it comes time to make a prediction, the predictions of each model are aggregated into one. Deleting a data point requires retraining only one model. The second method, slicing, further breaks up each subset. The model for that subset trains on slice 1, then slices 1 and 2, then 1 and 2 and 3, and so on, and the trained model is archived after each step. If you delete a data point from slice 3, you can revert to the third stage of training and go from there. Sharding and slicing “give us two knobs to tune how we train the model,” Papernot says. Guan calls their methods “pretty intuitive,” but says they use “a much less stringent standard of record removal.” 
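In outline, the slicing trick amounts to checkpointing and replay. Here is a minimal, framework-agnostic sketch; `fit_incremental` stands in for whatever per-slice training step a real model would use and is an assumption, not part of the paper.

```python
import copy

def train_with_slices(model, slices, fit_incremental):
    """fit_incremental(model, data) trains `model` in place on `data`.
    A checkpoint is saved before any slice and after each one."""
    checkpoints = [copy.deepcopy(model)]
    for data in slices:
        fit_incremental(model, data)
        checkpoints.append(copy.deepcopy(model))
    return model, checkpoints

def forget(slices, checkpoints, fit_incremental, slice_idx, point_idx):
    """Delete one training point, then retrain only from the affected slice onward,
    starting at the last checkpoint that the point never touched."""
    slices[slice_idx] = [x for i, x in enumerate(slices[slice_idx]) if i != point_idx]
    model = copy.deepcopy(checkpoints[slice_idx])
    for data in slices[slice_idx:]:
        fit_incremental(model, data)
    return model
```

Sharding is the same idea one level up: each shard’s data is handled independently by its own copy of this routine, and only the shard containing the deleted record is retrained.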

The Toronto researchers tested the method by training neural networks on two large datasets, one containing more than 600,000 images of home address numbers, and one containing more than 300,000 purchase histories. When deleting 0.001 percent of each dataset and then retraining, sharding (with 20 shards) made retraining go 3.75 times as fast for the addresses and 8.31 times as fast for the purchases (compared with training a model in the standard fashion and then retraining it from scratch without the deleted data points), with little reduction in accuracy. Slicing further increased speed by 18 percent for addresses and 43 percent for purchases, with no reduction in accuracy.

Deleting only 0.001 percent might not seem like much, but, Papernot says, it’s orders of magnitude more than the amount requested of services like Google search, according to publicly released figures. And an 18 percent speedup might not seem dramatic, but for giant models, that improvement can save lots of time and money. Further, in some cases you might know that certain data points are more likely to require forgetting—perhaps they belong to ethnic minorities or people with medical conditions, who might be more concerned about privacy violations. Concentrating these points in certain shards or slices can make deletion even more efficient. Papernot says they’re looking at ways to use knowledge of a dataset to better tailor SISA.

Certain AI methods aim to anonymize records, but there are reasons one might want AI to forget individual data points besides privacy, Guan says. Some people might not want to contribute to the profits of a disliked company—at least without profiting from their own data themselves. Or scientists might discover problems with data points post-training. (For instance, hackers can “poison” a dataset by inserting false records.) In both cases, efficient data deletion would be valuable.

“We certainly don’t have a full solution,” Guan says. “But we thought it would be very useful to define the problem. Hopefully people can start designing algorithms with data protection in mind.”

How to Improve Security Visibility and Detection-Response Operations in AWS

Security teams often handle a large stream of alerts, creating noise and impairing their ability to determine which incidents to prioritize. By aggregating security information from various sources and automating incident response, organizations can increase visibility into their environment and focus on the most important potential threats. In this webinar, SANS and AWS Marketplace explore how organizations can leverage solutions to create more signal and less noise for actionable responses, enhancing and accelerating security operations.

Register today to be among the first to receive the associated whitepaper written by SANS Analyst and Senior Instructor Dave Shackleford.

Attendees will learn to:

  • Use continuous monitoring to gain insight into events and behaviors that move into and through your cloud environment
  • Integrate security incident and event management (SIEM) solutions to enhance detection and investigation of potential threats
  • Leverage security orchestration, automation, and response (SOAR) technologies to auto-remediate events and reduce noise in your environment

Neural Networks Can Drive Virtual Racecars Without Learning

By Matthew Hutson

Animals are born with innate abilities and predispositions. Horses can walk within hours of birth, ducks can swim soon after hatching, and human infants are automatically attracted to faces. Brains have evolved to take on the world with little or no experience, and many researchers would like to recreate such natural abilities in artificial intelligence.

New research finds that artificial neural networks can evolve to perform tasks without learning. The technique could lead to AI that is much more adept at a wide variety of tasks such as labeling photos or driving a car.

Artificial neural networks are arrangements of small computing elements (“neurons”) that pass information between them. The networks typically learn to perform tasks like playing games or recognizing images by adjusting the “weights” or strengths of the connections between neurons. A technique called neural architecture search tries lots of network shapes and sizes to find ones that learn better for a specific purpose.

The new method uses this same search technique to find networks for which the weights don’t matter. For such a network, the network’s overall shape drives its intelligence—and could make it particularly well-suited to certain tasks.

“If animals have all these innate behaviors, and some neural networks can do well without a lot of training, we wondered how far we could push that idea,” said Adam Gaier, a computer scientist who was the paper’s lead author while working at Google Brain.

The process begins with a set of very simple networks that link inputs—say, data from a robot’s sensors—to behavioral outputs. It evaluates the nets’ performance on a given task, keeps the networks that performed best, and mutates them, by adding a neuron, adding a link, or changing how sensitive a neuron is to the sum of its inputs. In the evaluation phase, a shared random number is assigned to all of a network’s weights. (This is actually done for several random numbers, and the results are averaged.)
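The shared-weight evaluation at the heart of that loop is simple to sketch. This is a paraphrase of the procedure described above, not the authors’ code; `forward` and `env_rollout` are placeholders for a candidate network and a task simulator, and the particular weight values are illustrative.

```python
import numpy as np

def evaluate_architecture(forward, env_rollout,
                          shared_weights=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Score one candidate topology: every connection uses the same weight w,
    and the score is the average episode reward over several values of w."""
    scores = [env_rollout(lambda obs, w=w: forward(obs, w)) for w in shared_weights]
    return float(np.mean(scores))
```

The search then keeps the simplest, best-scoring topologies and mutates them, as described above.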

The results are called Weight Agnostic Neural Networks (WANNs). These networks get points for performing well on the task and also for being simple. While typical networks for the tasks in this study might have thousands of neurons and weights, the WANNs had only a handful of neurons and a single weight.

Somehow the WANNs still performed respectably. The research team compared them with standard network architectures whose weights were refined through experience to master three simulated tasks: driving a racecar, making a bipedal robot walk, and controlling a wheeled cart to balance a pole.

Image: Google Brain
A minimal architecture discovered in earlier generations is capable of controlling the Bipedal Walker shown here as it moves forward, despite not achieving an excellent score.

WANNs achieved scores ranging from about a sixth to half those of the trained nets. When the researchers assigned the best-performing weight instead of a random one, those numbers ranged from two thirds to four fifths the trained nets’ scores. And if, after evolution, the WANNs were trained in the same way as the much larger standard networks, their performance was on par.

On a task that involved recognizing written digits, WANNs achieved greater than 90 percent accuracy (versus 99 percent for a larger network trained on the task). The research was presented last month at the Neural Information Processing Systems (NeurIPS) conference, in Vancouver, Canada.

“The fact that they make the whole thing work is very impressive,” said Rosanne Liu, a computer scientist at Uber AI Labs who was not involved with the research. Others have tried and failed to develop networks that don’t depend on weights. Gaier says the breakthrough was originally a bug that assigned the same number to all weights, which ended up simplifying the architecture search. 

While the WANNs’ performance didn’t surpass that of larger trained networks, the method opens a new pathway for finding network architectures specially adapted to various tasks, in the way that parts of the brain are wired differently for specific purposes. Convolutional neural networks, with an architecture tailored to image recognition, mirror the structure of the brain’s visual cortex, for example. Gaier believes many more building blocks may be out there, ready to make AIs smart from birth.

Will China Attain Exascale Supercomputing in 2020?

By Mark Anderson
Photo-Illustration: Edmon de Haro (four country flags depicted as martial artists)

To the supercomputer world, what separates “peta” from “exa” is more than just three orders of magnitude.

As measured in floating-point operations per second (a.k.a. FLOPS), one petaflop (10¹⁵ FLOPS) falls in the middle of what might be called commodity high-performance computing (HPC). In this domain, hardware is hardware, and what matters most is increasing processing speed as cost-effectively as possible.

Now the United States, China, Japan, and the European Union are all striving to reach the exaflop (10¹⁸ FLOPS) scale. The Chinese have claimed they will hit that mark in 2020. But they haven’t said so lately: Attempts to contact officials at the National Supercomputer Center, in Guangzhou; Tsinghua University, in Beijing; and Xi’an Jiaotong University yielded either no response or no comment.

It’s a fine question of when exactly the exascale barrier is deemed to have been broken—when a computer’s theoretical peak performance exceeds 1 exaflop or when its maximum real-world compute speed hits that mark. Indeed, the sheer volume of compute power is less important than it used to be.

“Now it’s more about customization, special-purpose systems,” says Bob Sorensen, vice president of research and technology with the HPC consulting firm Hyperion Research. “We’re starting to see almost a trend towards specialization of HPC hardware, as opposed to a drive towards a one-size-fits-all commodity” approach.

The United States’ exascale computing efforts, involving three separate machines, total US $1.8 billion for the hardware alone, says Jack Dongarra, a professor of electrical engineering and computer science at the University of Tennessee. He says exascale algorithms and applications may cost another $1.8 billion to develop.

And as for the electric bill, it’s still unclear exactly how many megawatts one of these machines might gulp down. One recent ballpark estimate puts the power consumption of a projected Chinese exaflop system at 65 megawatts. If the machine ran continuously for one year, the electricity bill alone would come to about $60 million.
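That estimate is easy to sanity-check (assuming an industrial electricity rate of roughly $0.10 per kilowatt-hour, a figure the article does not state):

\[
65\ \mathrm{MW} \times 8760\ \mathrm{h} \approx 5.7 \times 10^{8}\ \mathrm{kWh},
\qquad
5.7 \times 10^{8}\ \mathrm{kWh} \times \$0.10/\mathrm{kWh} \approx \$57\ \mathrm{million}.
\]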

Dongarra says he’s skeptical that any system, in China or anywhere else, will achieve one sustained exaflop anytime before 2021, or possibly even 2022. In the United States, he says, two exascale machines will be used for public research and development, including seismic analysis, weather and climate modeling, and AI research. The third will be reserved for national-security research, such as simulating nuclear weapons.

“The first one that’ll be deployed will be at Argonne [National Laboratory, near Chicago], an open-science lab. That goes by the name Aurora or, sometimes, A21,” Dongarra says. It will have Intel processors, with Cray developing the interconnecting fabric between the more than 200 cabinets projected to house the supercomputer. A21’s architecture will reportedly include Intel’s Optane memory modules, which represent a hybrid of DRAM and flash memory. Peak capacity for the machine should reach 1 exaflop when it’s deployed in 2021.

The other U.S. open-science machine, at Oak Ridge National Laboratory, in Tennessee, will be called Frontier and is projected to launch later in 2021 with a peak capacity in the neighborhood of 1.5 exaflops. Its AMD processors will be dispersed in more than 100 cabinets, with four graphics processing units for each CPU.

The third, El Capitan, will be operated out of Lawrence Livermore National Laboratory, in California. Its peak capacity is also projected to come in at 1.5 exaflops. Launching sometime in 2022, El Capitan will be restricted to users in the national security field.

China’s three announced exascale projects, Dongarra says, also each have their own configurations and hardware. In part because of President Trump’s China trade war, China will be developing its own processors and high-speed interconnects.

“China is very aggressive in high-performance computing,” Dongarra notes. “Back in 2001, the Top 500 list had no Chinese machines. Today they’re dominant.” As of June 2019, China had 219 of the world’s 500 fastest supercomputers, whereas the United States had 116. (Tally together the number of petaflops in each machine and the numbers come out a little different. In terms of performance, the United States has 38 percent of the world’s HPC resources, whereas China has 30 percent.)

China’s three exascale systems are all built around CPUs manufactured in China. They are to be based at the National University of Defense Technology, using a yet-to-be-announced CPU; the National Research Center of Parallel Computer Engineering and Technology, using a nonaccelerated ARM-based CPU; and the Chinese HPC company Sugon, using an AMD-licensed x86 with accelerators from the Chinese company HyGon.

Japan’s future exascale machine, Fugaku, is being jointly developed by Fujitsu and Riken, using ARM architecture. And not to be left out, the EU also has exascale projects in the works, the most interesting of which centers on a European processor initiative, which Dongarra speculates may use the open-source RISC-V architecture.

All four of the major players—China, the United States, Japan, and the EU—have gone all-in on building out their own CPU and accelerator technologies, Sorensen says. “It’s a rebirth of interesting architectures,” he says. “There’s lots of innovation out there.”

Building a Quantum Computer From Off-the-Shelf Parts

By Mark Anderson

A new technique for fabricating quantum bits in silicon carbide wafers could provide a scalable platform for future quantum computers. The quantum bits, to the surprise of the researchers, can even be fabricated from a commercial chip built for conventional computing.

The recipe was surprisingly simple: Buy a commercially available wafer of silicon carbide (a temperature-robust semiconductor used in electric vehicles, LED lights, solar cells, and 5G gear) and shoot an electron beam at it. The beam creates a deficiency in the wafer which behaves, essentially, as a single electron spin that can be manipulated electrically, magnetically, or optically.

“It’s ironic after 50 years or so of trying to clean up semiconductors to make high-quality electronics, our plan is to put the defects back in—and use them to make a trapped atom in a semiconductor,” says David Awschalom, professor of molecular engineering at the University of Chicago.

Awschalom says his group at Chicago is one of a number that have followed up on the promise of a pioneering 2011 paper by researchers at the University of California, Santa Barbara—who first discovered that small defects in silicon carbide could be manipulated to become essentially room-temperature cages for individual electrons, whose spins can then be used as a quantum bit for possible computations and communications.

And these individual electron spins inside silicon carbide, subsequent research has established, retain their quantum information for up to a millisecond (a long time in the world of quantum computing) and can be tuned and addressed both with electrical gates and with lasers.

The technique could offer a rare medium that’s isolated enough from thermal noise to host quantum phenomena like entanglement—but not so isolated that qubits can’t be manipulated and run through a series of gates and logical operations.

"Our approach is to see if we can leverage the trillion dollars or so of American industry that’s building today’s nanoelectronics and see if we can pivot that technology,” Awschalom says.

“We thought we’d just buy commercial devices, create defects in them and see how well they worked. We were fairly pessimistic, because the material wasn’t created for quantum information technologies,” he says. “You might think, ‘This can’t work.’ But this is the beauty of research, you try it anyway. And what we learned were a series of things we honestly didn’t expect.”

In other words, it worked. In their paper, published in a recent issue of the journal Science, the group reports that the manufactured defects in their silicon carbide diodes produce a stable single-electron pocket that holds together at temperatures up to and well above room temperature.

Because of the configuration of the defects—having to do with a symmetry in the silicon carbide lattice—the individual electron spin can be manipulated not only by magnetic fields but also by electric fields.

“The one thing we can do today, like in your smartphone, is make a lot of transistors that are controlled with electrical gates,” Awschalom says. “So if you can control quantum states and their magnetic properties with electric fields, there’s an advantage. Because there’s a pathway to scale them using today’s electronics technology.”

The other key finding in the group’s research, he says, is the possibility for tuning these electron spins to be addressed by laser pulses as well.

The researchers published another recent paper in the journal Science Advances finding that the same silicon carbide qubits could serve as a potential quantum communication medium. The spins, that is, can be manipulated to be resonant with light across a broad range of frequencies, some 800 gigahertz wide. And the line width of those spins, Awschalom says, is pretty tight, too: just 20 megahertz.

This means that any individual qubit could potentially be tuned to communicate across one of some 40,000 separate frequency ranges—sort of like a quantum ham radio with some 40,000 individual channels.
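The channel count follows directly from those two figures:

\[
\frac{800\ \mathrm{GHz}}{20\ \mathrm{MHz}} = \frac{8\times 10^{11}\ \mathrm{Hz}}{2\times 10^{7}\ \mathrm{Hz}} = 4\times 10^{4}\ \text{channels}.
\]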

“You can begin to think about quantum multiplexing in a commercial wafer,” Awschalom says.

To be clear, Awschalom’s group does not have anything close to a working quantum computer at the moment. They technically don’t even have a provably viable quantum bit yet—for example, one that can be reliably and repeatably taken through the paces of any quantum computation or communication protocol. They do, however, have a candidate qubit and a quantum computing technology with a fair amount of innate promise.

“We’re not building quantum machines with silicon carbide,” Awschalom says. “But what’s exciting in these early days, these commercial-grade materials have beautiful quantum properties, which now is an easily accessible playground for researchers. One of the exciting things about this discovery is people can go online, buy a wafer and start doing these measurements.”

This post was updated on 8 January 2020. 

Breaking Down Barriers in FPGA Engineering Speeds up Development

By Digilent


It’s hard to reinvent the wheel—they’re round and they spin. But you can make them more efficient, faster, and easier for anyone to use. This is essentially what Digilent Inc. has done with its new Eclypse Z7 Field-Programmable Gate Array (FPGA) board. The Eclypse Z7 is the first host board of Digilent’s new Eclypse platform, which aims to increase productivity and accelerate FPGA system design.

To accomplish this, Digilent has taken the design and development of FPGAs out of the silo restricted to highly specialized digital design engineers or embedded systems engineers and opened it up to a much broader group of people who have knowledge of common programming languages, like C and C++. Additional languages like Python and LabVIEW are expected to be supported in future updates.

FPGAs have long been a key tool for engineers, letting them tailor a circuit exactly to the needs of a particular application. To program these FPGAs, specialized development tools are needed. Typically, the tool chain used for Xilinx FPGAs is a programming environment known as Vivado, provided by Xilinx, one of the original developers of FPGAs.

“FPGA development environments like Vivado really require a very niche understanding and knowledge,” said Steve Johnson, president of Digilent. “As a result, they are relegated to a pretty small cadre of engineers.”

Johnson added, “Our intent with the Eclypse Z7 is to empower a much larger number of engineers and even scientists so that they can harness the power of these FPGAs and Systems on a Chip (SoCs), which typically would be out of their reach. We want to broaden the customer base and empower a much larger group of people.”

Digilent didn’t just target relatively easy SoC devices. Instead, the company jumped into the deep end of the FPGA pool and built its board around the Zynq 7020 FPGA SoC from Xilinx, which combines a dual-core ARM processor with an FPGA fabric. This complex part presents even more of a challenge for most engineers.

To overcome this complexity, Johnson explains that they essentially abstracted the complexity out of the system level development of Input/Output (I/O) modules by incorporating a software layer and FPGA “blocks” that serve as a kind of driver.

“You can almost think of it as when you plug a printer into a computer, you don't need to know all of the details of how that printer works,” explained Johnson. “We're essentially providing a low-level driver for each of these I/O modules so that someone can just plug it in.”

With this capability, a user can configure an I/O device they’ve just plugged in and start acquiring data from it, according to Johnson. Typically, this would require weeks of work poring over data sheets and understanding the registers of the devices you’ve plugged in. You would need to learn how to communicate with each device at a very low level so that it was properly configured to move data back and forth. With the new Eclypse Z7, all of that trouble has been taken off the table.
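To make the printer analogy concrete, the workflow Johnson describes looks something like the following. This is a hypothetical sketch only; the class and method names are invented for illustration and are not Digilent’s actual API (which today targets C and C++).

```python
class ZmodADCDriver:
    """Hypothetical stand-in for the low-level driver shipped with an I/O module."""

    def __init__(self, port):
        self.port = port               # connector the module is plugged into

    def configure(self, sample_rate_hz, channel=0):
        # Register-level setup is hidden behind this one call.
        self.sample_rate_hz = sample_rate_hz
        self.channel = channel

    def acquire(self, n_samples):
        # A real driver would stream samples out of the FPGA fabric;
        # here we just return a placeholder buffer.
        return [0.0] * n_samples


adc = ZmodADCDriver(port="A")
adc.configure(sample_rate_hz=100_000_000)
samples = adc.acquire(4096)            # data without touching a single register
```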


Beyond the software element of the new platform, there’s a focus on high-speed analog and digital I/O. This is partly due to Digilent’s alignment with its parent company—National Instruments—and its emphasis on automated measurement. This high-speed analog and digital I/O is expected to be a key feature for applications where FPGAs and SoCs are really powerful: Edge Computing.

In these Edge Computing environments, such as in predictive maintenance, you need analog inputs to be able to do vibration or signal monitoring applications. In these types of applications you need high-speed analog inputs and outputs and a lot of processing power near the sensor.

The capabilities of these FPGA and SoC devices in Edge Computing could lead to applying machine learning or artificial intelligence on these devices, ushering in a convergence between two important trends, Artificial Intelligence (AI) and the Internet of Things (IoT), that’s coming to be known as the Artificial Intelligence of Things (AIoT), according to Johnson.

Currently, the FPGA and SoC platforms used in these devices can take advantage of 4G networks to enable Edge devices like those envisioned in AIoT scenarios. But this capability will be greatly enhanced when 5G networks are mature. At that time, Johnson envisions you'll just have a 5G module that you can plug into a USB or miniPCIe port on an Edge device.

“These SoCs—these ARM processors with the FPGAs attached to them—are exactly the right kind of architecture to do this low-power, small form factor, Edge Computing,” said Johnson. “The analog input that we're focusing on is intended to both sense the real world and then process and deliver that information. So they're meant exactly for that kind of application.”

This move by Digilent to empower a greater spectrum of engineers and scientists is in line with their overall aim of helping customers create, prototype and develop small, embedded systems—whether they are medical devices or edge computing devices.

Improving Codec Execution With ARM Cortex-M Processors

Digital Signal Processing (DSP) has traditionally required an expensive dedicated DSP processor. Solutions have been implemented on microcontrollers using fixed-point math libraries for decades, but those software libraries consume many more processing cycles than a processor capable of executing DSP instructions natively.

In this paper, we will explore how we can speed up DSP codecs using the DSP extensions built into the Arm Cortex-M processors.

You will learn:

  • The technology trends moving data processing to the edge of the network to enable more compute performance
  • What are the DSP extensions on the Arm Cortex-M processors and the benefits they bring, including cost savings and decreased system-level complexity
  • How to convert analog circuits to software using modeling software such as MathWorks MATLAB or Advanced Solutions Nederlands (ASN) filter designer
  • How to utilize the floating-point unit (FPU) with Cortex-M to improve performance
  • How to use the open-source CMSIS-DSP software library to create IIR and FIR filters in addition to calculating a Fast Fourier Transform (FFT)
  • How to implement an IIR filter that utilizes CMSIS-DSP, designed with the Advanced Solutions Nederlands (ASN) Filter Designer (a stand-in sketch of the filtering and FFT steps follows this list)
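As a stand-in illustration of the last two bullet points, the chain below designs a low-pass IIR filter, runs it over a two-tone test signal, and takes an FFT using NumPy/SciPy. On a Cortex-M target the equivalent steps would use CMSIS-DSP in C (for example, arm_biquad_cascade_df1_f32 for the biquad cascade and arm_rfft_fast_f32 for the FFT); the sample rate and cutoff below are arbitrary assumptions.

```python
import numpy as np
from scipy import signal

fs = 48_000                                   # sample rate in Hz (assumed)
t = np.arange(0, 0.1, 1 / fs)
x = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 9_000 * t)

# 4th-order Butterworth low-pass as second-order sections (biquads),
# the kind of coefficient set a filter-designer tool exports for CMSIS-DSP.
sos = signal.butter(4, 3_000, btype="low", fs=fs, output="sos")
y = signal.sosfilt(sos, x)                    # filter the signal

spectrum = np.abs(np.fft.rfft(y))             # magnitude spectrum of the result
freqs = np.fft.rfftfreq(len(y), d=1 / fs)
```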

Network Monitoring With InfluxData

In this paper, we discuss how you can use InfluxDB to gain the necessary visibility into the status, performance, and responsiveness of your environments.


Monitoring Your Network with Time Series

Networks play a fundamental role in the adoption and growth of Internet applications. Penetrating enterprises, homes, factories, and even cities, networks sustain modern society. In this webinar, Daniella Pontes of InfluxData will explore the flexibility and potential use cases of open source and time series databases.

In this webinar you will:

  • Learn how to use a time series database platform to monitor your network
  • Understand the value of using open source tools
  • Gain insight into what key aspects of network monitoring you should focus on

A simple guide to antenna selection for compliance testing

Download this whitepaper to elevate your understanding of the antennas required to perform compliance testing over the required frequency range.