
Platform Analyzer - Analyzing Healthy and Not-So-Healthy Applications


Recently my wife purchased a thick and expensive book. As an ultrasonic diagnostician for children, she purchases many books, but this one had me puzzled.  The book was titled Ultrasound Anatomy of the Healthy Child.  Why would she need a book that showed only healthy children?  I asked her and her answer was simple: to diagnose any disease, even one not yet discovered, you need to know what a healthy child looks like. 

In this article we will act like doctors, analyzing and comparing a healthy and a not-so-healthy application.

Knock – knock – knock.

The doctor says: “It’s open, please enter.”

In walks our patient,  Warrior Wave*, an awesome game in which your hand acts as the road for the warriors to cross. It’s extremely fun to play, innovative, and uses Intel® RealSense™ technology. 

While playing the game, though, something felt a little off.  Something that I hadn’t felt before in other games based on Intel® RealSense™ technology.  The problem could be caused by so many things, but what is it in this case?  

Like any good doctor who is equipped with the latest and greatest analysis tools to diagnose the problem, we have the perfect tools to analyze our patient.

Using Intel® Graphics Performance Analyzer (Intel® GPA) Platform Analyzer, we receive a time-line view of our application’s CPU load, frame time, frames per second (FPS), and draw calls:

Let’s take a look.

Hmm… the first things that catch our eye are the regular FPS surges that occur periodically. All is relatively smooth for ~200 milliseconds and then jumps up and down severely.

For comparison, let’s look at a healthy FPS trace below. The game in this trace felt smooth and played well.

No pattern was evident within the frame time, just normal random deviations.

But in our case we see regular surges. These surges happen around four times a second.  Let’s investigate the problem more deeply by zooming in on one of the surges and seeing what’s happening in the threads:

We can see that worker thread 2780 spends most of its time in synchronization. The thread does almost nothing but wait for the next frame from the Intel® RealSense™ SDK:

At the same time, we see that rendering goes in another worker thread. If we scroll down, we find thread 2372.

Instead of “actively” waiting for the next frame from the Intel RealSense SDK, the game could be doing valuable work. Drawing and Intel® RealSense™ SDK work could be done in one worker thread instead of two, simplifying thread communication.
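
As a rough illustration of that suggestion, a single worker loop can block on the next SDK frame and then render in the same thread. The sketch below is only a minimal outline using the Intel® RealSense™ SDK's PXCSenseManager interface, not the game's actual code; the rendering callback is a placeholder.

// Minimal sketch: one worker thread that both waits for the next Intel RealSense SDK
// frame and issues the draw calls, instead of a waiting thread plus a rendering thread.
#include "pxcsensemanager.h"

void SingleWorkerLoop(PXCSenseManager *senseManager, // assumed: created, hand module enabled, Init() called
                      bool (*renderFrame)())         // placeholder for the game's DirectX rendering
{
    // AcquireFrame(true) blocks until the next frame is ready; the thread is not "actively"
    // polling, and all other work happens right here after the wait.
    while (senseManager->AcquireFrame(true) >= PXC_STATUS_NO_ERROR) {
        // ... query hand/gesture data for this frame here ...
        senseManager->ReleaseFrame();

        if (!renderFrame())   // draw in the same thread: no cross-thread hand-off needed
            break;
    }
}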

Excessive inter-thread communication can drastically slow down the execution and cause many problems.

Here is the example of a “healthy” game, where the Intel® RealSense™ SDK work and the DirectX* calls are in one thread. 

RealSense™ experts say: there is no point in waiting for the frames from the Intel® RealSense™ SDK. They won’t be ready any faster. 

But we can see that the main problem is at the top of the timeline.

On average, five out of six CPU frames did not result in a GPU frame. This is the cause of the slow and uneven GPU frame rate, which on average is less than 16 FPS.

Now let’s look at the pipeline to try and understand how the code is executing.  Looking at the amount of packets on “Engine 0,” the pipeline is filled to the brim, but the execution is almost empty.

The brain can process 10 to 12 separate images per second, perceiving them individually. This explains why the first movies were cut at a rate of 16 FPS: this is the average threshold at which the majority of people stop seeing a slide show and start seeing a movie.

Once again, let’s see the profile of the nice-looking game: 

Notice that the GPU frames follow the CPU frames with little shift. For every CPU frame, there is a corresponding GPU that starts execution after a small delay.

Let’s try to understand why our game doesn’t have this pattern.

First, let’s examine our DirectX* calls. The highlighted one with the tooltip is our “Present” call that sends the finished frame to the GPU. In the screenshot above, we see that it creates a “Present” packet on the GPU pipeline (marked with X’s).  At around the 2215 ms mark, it has moved closer to execution, jumping over three positions, but at 2231 ms it just disappears without completing execution.

And if we look at each present call within the trace, not one call successfully makes it to execution.

Question: How does the game draw itself if all our DirectX* Present calls are ignored?! Good thing we have good tools so we can figure this out. Let’s take a look.

Can you see something curious inside the gray oval? We can see that this packet, not caused by any DirectX* call of our code, still gets to the execution, fast and out of order. Hey, wait a minute!!!

Let's look closely at our packet. 

And now to the packet that got executed. 

Wow! It came from an EXTERNAL thread. What could this mean? External threads are threads that don’t belong to the game.

Our own packets get ignored, but an external thread draws our game? What? Hey, this tool went nuts!

No, the image is quite right. The explanation is that on the Windows* system (starting with Windows Vista*), there is a program called Desktop Window Manager (DWM), which does the actual composition on the screen. Its packets are the ones we see executing at a fast rate with high priority.  And no, our packets aren’t lost—they are intercepted by DWM to create the final picture.

But why would DWM get involved in a full-screen game? After thinking a while, I realized that the answer is simple: I have a multi-monitor desktop configuration. Switching my second monitor off in the display configuration made Warrior Wave behave like other games: normal GPU FPS, no glitches, and no DWM packets.

The patient will live! What a relief!

But other games still worked well even with a multi-monitor configuration, right (says the evil voice in the back of my head)?

To dig deeper, we need another tool. Intel® GPA Platform Analyzer allows you to see CPU and GPU execution over time, but it doesn’t give you lower-level details of each frame.

We would need to look more closely at the Direct3D* Device creation code. For this we could use Intel® GPA Frame Analyzer for DirectX*, but this is a topic for another article.

So let’s summarize what we have learned:

During this investigation we were able to detect poor usage of threads that led to FPS surges, and a nasty DWM problem that was easily fixed by removing the second monitor from the desktop configuration.

Conclusion: Intel® GPA Platform Analyzer is a must-have tool for initial investigation of the problem. Get familiar with it and add it to your toolbox.

About the Author:

Alexander Raud works in the Intel® Graphics Performance Analyzers team in Russia and previously worked on the VTune Amplifier. Alex has dual citizenship in Russia and the EU, speaks Russian, English, some French, and is learning Spanish.  Alex has a wife and two children and still manages to play Progressive Metal professionally and head the International Ministry at Jesus Embassy Church.


Performance Considerations for Resource Binding in Microsoft DirectX* 12


By Wolfgang Engel, CEO of Confetti

With the release of Windows* 10 on July 29 and the release of the 6th generation Intel® Core™ processor family (code-name Skylake), we can now look closer into resource binding specifically for Intel® platforms.

The previous article “Introduction to Resource Binding in Microsoft DirectX* 12” introduced the new resource binding methods in DirectX 12 and concluded that with all these choices, the challenge is to pick the most desirable binding mechanism for the target GPU, types of resources, and their frequency of update.

This article describes how to pick different resource binding mechanisms to run an application efficiently on specific Intel® GPUs.

Tools of the Trade

To develop games with DirectX 12, you need the following tools:

  • Windows 10
  • Visual Studio* 2013 or higher
  • DirectX 12 SDK (included with Visual Studio)
  • DirectX 12-capable GPU and drivers

Overview

A descriptor is a block of data that describes an object to the GPU, in a GPU-specific opaque format. DirectX 12 offers the following descriptors, previously named “resource views” in DirectX 11:

  • Constant buffer view (CBV)
  • Shader resource view (SRV)
  • Unordered access view (UAV)
  • Sampler view (SV)
  • Render target view (RTV)
  • Depth stencil view (DSV)
  • and others

These descriptors or resource views can be considered a structure (also called a block) that is consumed by the GPU front end. The descriptors are roughly 32–64 bytes in size and hold information like texture dimensions, format, and layout.

Descriptors are stored in a descriptor heap, which represents a sequence of structures in memory.

A descriptor table holds offsets into this descriptor heap. It maps a contiguous range of descriptors to shader slots by making them available through a root signature. This root signature can also hold root constants, root descriptors, and static samplers.


Figure 1. Descriptors, descriptor heap, descriptor tables, root signature.

Figure 1 shows the relationship between descriptors, a descriptor heap, descriptor tables, and the root signature.

The code that Figure 1 describes looks like this:

// the Init function sets the shader registers
// parameters: type of descriptor, num of descriptors, base shader register
// the first descriptor table entry in the root signature in
// Figure 1 sets shader registers t1, b1, t4, t5
// performance: order from most frequently to least frequently used
CD3DX12_DESCRIPTOR_RANGE Param0Ranges[3];
Param0Ranges[0].Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 1, 1); // t1
Param0Ranges[1].Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 1); // b1
Param0Ranges[2].Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 2, 4); // t4-t5

// the second descriptor table entry in the root signature
// in Figure 1 sets shader registers u0 and b2
CD3DX12_DESCRIPTOR_RANGE Param1Ranges[2];
Param1Ranges[0].Init(D3D12_DESCRIPTOR_RANGE_TYPE_UAV, 1, 0); // u0
Param1Ranges[1].Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 2); // b2

// set the descriptor tables in the root signature
// parameters: number of descriptor ranges, descriptor ranges, visibility
// visibility to all stages allows sharing binding tables
// with all types of shaders
CD3DX12_ROOT_PARAMETER Param[4];
Param[0].InitAsDescriptorTable(3, Param0Ranges, D3D12_SHADER_VISIBILITY_ALL);
Param[1].InitAsDescriptorTable(2, Param1Ranges, D3D12_SHADER_VISIBILITY_ALL);
// root descriptor
Param[2].InitAsShaderResourceView(1, 0); // t0
// root constants
Param[3].InitAsConstants(4, 0); // b0 (4x32-bit constants)

// writing into the command list
UINT rootConstants[4] = { 1, 3, 3, 7 };
cmdList->SetGraphicsRootDescriptorTable(0, [srvGPUHandle]);
cmdList->SetGraphicsRootDescriptorTable(1, [uavGPUHandle]);
cmdList->SetGraphicsRootShaderResourceView(2, [srvGPUVirtualAddress]);
cmdList->SetGraphicsRoot32BitConstants(3, 4, rootConstants, 0);

The source code above sets up a root signature that has two descriptor tables, one root descriptor, and one root constant. The code also shows that root constants have no indirection and are directly provided with the SetGraphicsRoot32BitConstants call. They are routed directly into the shader registers; there is no actual constant buffer, constant buffer descriptor, or binding happening. Root descriptors have only one level of indirection, because they store a pointer to memory (descriptor -> memory), and descriptor tables have two levels of indirection (descriptor table -> descriptor -> memory).
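
To complete the picture, here is a hedged sketch of how the four root parameters above could be turned into an actual root signature object; the flags, error handling, and the device variable are illustrative assumptions, not part of the original listing.

// Sketch: serialize and create the root signature from the Param[] array above.
D3D12_ROOT_SIGNATURE_DESC rootDesc = {};
rootDesc.NumParameters = 4;
rootDesc.pParameters   = Param;
rootDesc.Flags         = D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT; // assumption

ID3DBlob *serialized = nullptr;
ID3DBlob *errors     = nullptr;
D3D12SerializeRootSignature(&rootDesc, D3D_ROOT_SIGNATURE_VERSION_1, &serialized, &errors);

ID3D12RootSignature *rootSignature = nullptr;
device->CreateRootSignature(0, serialized->GetBufferPointer(), serialized->GetBufferSize(),
                            IID_PPV_ARGS(&rootSignature));

// the root signature must be set on the command list before the SetGraphicsRoot* calls above
cmdList->SetGraphicsRootSignature(rootSignature);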

Descriptors live in different heaps depending on their types, such as SV and CBV/SRV/UAV. This is due to wildly inconsistent sizes of descriptor types on different hardware platforms. For each type of descriptor heap, there should be only one heap allocated because changing heaps could be expensive.

In general DirectX 12 offers an allocation of more than one million descriptors upfront, enough for a whole game level. While previous DirectX versions dealt with allocations in the driver on their own terms, with DirectX 12 it is possible to avoid any allocations during runtime. That means any initial allocation of a descriptor can be taken out of the performance “equation.”
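
As a hedged illustration of that "allocate once, up front" advice, the following sketch creates a single large shader-visible CBV/SRV/UAV heap and computes the handle of an arbitrary slot; the heap size and the slot-indexing helper are assumptions, not the only valid layout.

#include <d3d12.h>

// Sketch: allocate one large shader-visible CBV/SRV/UAV descriptor heap up front.
ID3D12DescriptorHeap* CreateBigDescriptorHeap(ID3D12Device *device)
{
    D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
    heapDesc.Type           = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
    heapDesc.NumDescriptors = 1000000;   // assumption: sized once for the whole level
    heapDesc.Flags          = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;

    ID3D12DescriptorHeap *heap = nullptr;
    device->CreateDescriptorHeap(&heapDesc, IID_PPV_ARGS(&heap));
    return heap;
}

// Individual descriptors are addressed by offsetting the heap start with a
// hardware-specific increment, so no further heap allocations are needed at runtime.
D3D12_CPU_DESCRIPTOR_HANDLE DescriptorAt(ID3D12Device *device, ID3D12DescriptorHeap *heap, UINT index)
{
    UINT increment = device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
    D3D12_CPU_DESCRIPTOR_HANDLE handle = heap->GetCPUDescriptorHandleForHeapStart();
    handle.ptr += static_cast<SIZE_T>(index) * increment;
    return handle;
}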

Note: With 3rd generation Intel® Core™ processors (code-name Ivy Bridge)/4th generation Intel® Core™ processor family (code-name Haswell) and DirectX 11 and the Windows Display Driver Model (WDDM) version 1.x, resources were dynamically mapped into memory based on the resources referenced in the command buffer with a page table mapping operation. This way copying data was avoided. The dynamic mapping was important because those architectures only offer 2 GB of memory to the GPU (Intel® Xeon® processor E3-1200 v4 product family (code-name Broadwell) offers more).
With DirectX 12 and WDDM version 2.x, it is no longer possible to remap resources into the GPU virtual address space as necessary, because resources have to be assigned a static virtual address when created and therefore the virtual address of resources cannot change after creation. Even if a resource is “evicted” from GPU memory, it maintains its virtual address for later when it is made resident again.
Therefore the overall available memory of 2 GB in Ivy Bridge/Haswell can become a limiting factor.

As stated in the previous article, a perfectly reasonable outcome for an application might be a combination of all types of bindings: root constants, root descriptors, descriptor tables for descriptors gathered on-the-fly as draw calls are issued, and dynamic indexing of large descriptor tables.

Different hardware architectures will show different performance trade-offs between using sets of root constants and root descriptors versus using descriptor tables. Therefore it might be necessary to tune the ratio between root parameters and descriptor tables depending on the hardware target platforms.

Expected Patterns of Change

To understand which kinds of change incur an additional cost, we have to analyze first how game engines typically change data, descriptors, descriptor tables, and root signatures.

Let’s start with what is called constant data. Most game engines usually store all constant data in “system memory.” The game engine changes the data in CPU-accessible memory, and later in the frame a whole block of constant data is copied/mapped into GPU memory and then read by the GPU through a constant buffer view or through a root descriptor.

If the constant data is provided through SetGraphicsRoot32BitConstants() as a root constant, the entry in the root signature does not change but the data might change. If it is provided through a CBV (a descriptor) referenced by a descriptor table, the descriptor doesn’t change but the data might change.

In case we need several constant buffer views—for example, for double or triple buffered rendering— the CBV or descriptor might change for each frame in the root signature.
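
A minimal sketch of that per-frame scheme follows, under the assumption that there is one 256-byte-aligned constant block per in-flight frame in an upload buffer, that the descriptor heap comes from a setup like the one sketched earlier, and that root parameter 0 is the relevant descriptor table.

#include <d3d12.h>

// Sketch: create one CBV per in-flight frame at consecutive heap slots ...
void CreatePerFrameCBVs(ID3D12Device *device, ID3D12DescriptorHeap *heap,
                        ID3D12Resource *constantUploadBuffer, UINT frameCount)
{
    const UINT alignedSize = 256;  // CBV sizes must be multiples of 256 bytes
    UINT increment = device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
    D3D12_CPU_DESCRIPTOR_HANDLE handle = heap->GetCPUDescriptorHandleForHeapStart();

    for (UINT frame = 0; frame < frameCount; ++frame) {
        D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {};
        cbvDesc.BufferLocation = constantUploadBuffer->GetGPUVirtualAddress() + frame * alignedSize;
        cbvDesc.SizeInBytes    = alignedSize;
        device->CreateConstantBufferView(&cbvDesc, handle);   // one CBV per in-flight frame
        handle.ptr += increment;
    }
}

// ... and each frame, point the descriptor table at the current frame's CBV
// instead of rewriting any descriptors.
void BindCurrentFrame(ID3D12GraphicsCommandList *cmdList, ID3D12Device *device,
                      ID3D12DescriptorHeap *heap, UINT frameIndex)
{
    UINT increment = device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
    D3D12_GPU_DESCRIPTOR_HANDLE handle = heap->GetGPUDescriptorHandleForHeapStart();
    handle.ptr += static_cast<UINT64>(frameIndex) * increment;
    cmdList->SetGraphicsRootDescriptorTable(0, handle);       // assumption: table is root parameter 0
}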

For texture data, it is expected that the texture is allocated in GPU memory during startup. Then an SRV (a descriptor) is created and stored in a descriptor table, or a static sampler is defined, and then referenced in the root signature. The data and the descriptor or static sampler do not change after that.

For dynamic data like changing texture or buffer data (for example, textures with rendered localized text, buffers of animated vertices or procedurally generated meshes), we allocate a render target or buffer, provide an RTV or UAV, which are descriptors, and then these descriptors might not change from there on. The data in the render target or buffer might change.

In case we need several render targets or buffers—for example, for double or triple buffered rendering—the descriptors might change for each frame in the root signature.

For the following discussion, a change is considered important for binding resources if it does the following:

  • Changes/replaces a descriptor in a descriptor table, for example, the CBVs, RTVs, or UAVs described above
  • Changes any entry in the root signature

Descriptors in Descriptor Tables with Haswell/Broadwell

On platforms based on Haswell/Broadwell, the cost of changing one descriptor table in the root signature is equivalent to changing all descriptor tables. Changing one argument means that the hardware has to make a copy (version) of all the current arguments. The number of root parameters in a root signature is the amount of data that the hardware has to version when any subset changes.

Note: All the other types of memory in DirectX 12, like descriptor heaps, buffer resources, and so on, are not versioned by hardware.

In other words, changing all of the parameters is roughly the same cost as just changing one (see [Lauritzen] and [MSDN]). Changing none is still the cheapest, but not that useful.

Note: Other hardware that has, for example, a split between fast and slow (spill) root argument storage only has to version the region of memory where the argument changed: either the fast area or the spill area.

On Haswell/Broadwell, an additional cost of changing descriptor tables can come from the limited size of the binding table in hardware.

Descriptor tables on those hardware platforms use “binding table” hardware. Each binding table entry is a single DWORD that can be considered an offset into the descriptor heap. Binding tables are allocated from a 64 KB ring, so the ring can store 16,384 binding table entries (64 KB / 4 bytes per entry).

In other words the amount of memory consumed per draw call is dependent on the total number of descriptors that are indexed in a descriptor table and then referenced through a root signature.

In case we run out of the 64 KB memory for the binding table entries, the driver will allocate another 64 KB binding table. The switch between those tables leads to a pipeline stall as shown in Figure 2.

Pipeline stall (courtesy of Andrew Lauritzen)

Figure 2. Pipeline stall (courtesy of Andrew Lauritzen).

For example, if a root signature references 64 descriptors in a descriptor table, the stall will happen every 16,384 / 64 = 256 draw calls.

Because changing a root signature is considered cheap, having multiple root signatures with a low number of descriptors in the descriptor table is favorable over having root signatures with a larger amount of descriptors in the descriptor table.

Therefore it is favorable on Haswell/Broadwell to keep the number of descriptors referenced in descriptor tables as low as possible.

What does this mean for renderer designs? Using more descriptor tables with fewer descriptors each, and therefore more root signatures, should increase the number of pipeline state objects (PSOs): with an increased number of root signatures, the number of PSOs needs to increase because of the one-to-one relationship between the two.

Having more pipeline state objects might lead to a larger number of shaders that, in this case, might be more specialized, instead of longer shaders that offer a wider range of features, which is the common recommendation.
 

Root Constants/Descriptors on Haswell/Broadwell

Just as changing one descriptor table costs the same as changing all of them, changing one root constant or root descriptor is equivalent to changing all of them (see [Lauritzen]).

Root constants are implemented as “push constants,” a buffer that the hardware uses to prepopulate Execution Unit (EU) registers. Because the values are immediately available when the EU thread launches, it can be a performance win to store constant data as root constants instead of storing it in descriptor tables.

Root descriptors are implemented as “push constants” as well. They are just pointers passed as constants to the shader, reading data through the general memory path.

Descriptor Tables versus Root Constants/Descriptors on Haswell/Broadwell

Now that we have looked at the way descriptor tables, root constants, and root descriptors are implemented, we can answer the main question of this article: is one favorable over the other? Because of the limited size of the binding table hardware and the potential stalls resulting from crossing this limit, changing root constants and root descriptors is expected to be cheaper on Haswell/Broadwell hardware because they do not use the binding table hardware. For root descriptors and root constants, this is especially recommended when the data changes with every draw call.
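
A hedged sketch of that per-draw pattern is shown below, reusing the root parameter indices from the earlier root signature listing (3 for the root constants, 2 for the root SRV); the data layout, buffer stride, and draw parameters are assumptions.

#include <d3d12.h>
#include <vector>

struct PerDrawConstants { UINT materialId; UINT flags; float scale; float bias; };  // assumed layout, 4 x 32 bit

void DrawWithRootArguments(ID3D12GraphicsCommandList *cmdList,
                           const std::vector<PerDrawConstants> &objects,
                           D3D12_GPU_VIRTUAL_ADDRESS perObjectBuffer)  // assumption: 256-byte stride per object
{
    for (size_t i = 0; i < objects.size(); ++i) {
        // root constants: values land directly in EU registers, no binding table entry is consumed
        cmdList->SetGraphicsRoot32BitConstants(3, 4, &objects[i], 0);

        // root descriptor: just a GPU virtual address, also bypasses the binding table
        cmdList->SetGraphicsRootShaderResourceView(2, perObjectBuffer + i * 256);

        cmdList->DrawInstanced(36, 1, 0, 0);  // illustrative draw call
    }
}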

Static Samplers on Haswell/Broadwell

As described in the previous article, it is possible to define samplers in the root signature or right in the shader with HLSL root signature language. These are called static samplers.

On Haswell/Broadwell hardware, the driver will place static samplers in the regular sampler heap. This is equivalent to putting them into descriptors manually. Other hardware implements samplers in shader registers, so static samplers can be compiled directly into the shader.

In general static samplers should be a win on many platforms, so there is no downside to using them. On Haswell/Broadwell hardware there is still the chance that by increasing the number of descriptors in a descriptor table, we end up more often with a pipeline stall, because descriptor table hardware has only 16,384 slots to offer.

Here is the syntax for a static sampler in HLSL:

StaticSampler( sReg,
               [ filter = FILTER_ANISOTROPIC,
               addressU = TEXTURE_ADDRESS_WRAP,
               addressV = TEXTURE_ADDRESS_WRAP,
               addressW = TEXTURE_ADDRESS_WRAP,
               mipLODBias = 0.f,     maxAnisotropy = 16,
               comparisonFunc = COMPARISON_LESS_EQUAL,
               borderColor = STATIC_BORDER_COLOR_OPAQUE_WHITE,
               minLOD = 0.f, maxLOD = 3.402823466e+38f,
               space = 0, visibility = SHADER_VISIBILITY_ALL ])

Most of the parameters are self-explanatory because they are similar to the C++ level usage. The main difference is the border color: the C++ level offers a full color range, while the HLSL level is restricted to opaque white, opaque black, and transparent black. An example of a static sampler is:

StaticSampler(s4, filter=FILTER_MIN_MAG_MIP_LINEAR)

Skylake

Skylake allows dynamic indexing of the entire descriptor heap (~1 million resources) in one descriptor table. That means one descriptor table could be enough to index all the available descriptor heap memory.

Compared to previous architectures, it is not necessary to change descriptor table entries in the root signature as often. That also means that the number of root signatures can be reduced. Obviously different materials will require different shaders and therefore different PSOs. But those PSOs can reference the same root signatures.

With modern rendering engines using fewer shaders than their DirectX 9 and 11 ancestors, so that they can avoid the cost of changing shaders and the attached states, reducing the number of root signatures and therefore the number of PSOs is favorable and should result in a performance gain on any hardware platform.

Conclusion

Focusing on Haswell/Broadwell and Skylake, the recommendations for developing performant DirectX 12 applications depend on the underlying platform. While for Haswell/Broadwell the number of descriptors in a descriptor table should be kept low, for Skylake it is recommended to keep this number high and decrease the number of descriptor tables.

To achieve optimal performance, the application programmer can check during startup for the type of hardware and then pick the most efficient resource binding pattern, as sketched below. (There is a GPU Detect example that shows how to detect different Intel® hardware architectures at https://software.intel.com/en-us/articles/gpu-detect-sample/.) The choice of resource binding pattern will influence how shaders for the system are written.
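
A rough sketch of such a startup check follows; the device-ID classification is left as a hypothetical helper standing in for the device-ID tables in the GPU Detect sample linked above, and the strategy enum is an assumption of this sketch.

#include <dxgi1_4.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

enum class BindingStrategy { FewLargeDescriptorTables, SmallTablesAndRootArguments };

// hypothetical helper: maps Intel device IDs to architectures, see the GPU Detect sample
bool IsSkylakeOrLater(UINT deviceId);

BindingStrategy PickBindingStrategy()
{
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return BindingStrategy::SmallTablesAndRootArguments;

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        if (desc.VendorId == 0x8086) {   // Intel
            return IsSkylakeOrLater(desc.DeviceId)
                 ? BindingStrategy::FewLargeDescriptorTables       // Skylake: index one big table
                 : BindingStrategy::SmallTablesAndRootArguments;   // Haswell/Broadwell: keep tables small
        }
    }
    return BindingStrategy::SmallTablesAndRootArguments;           // conservative default
}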

About the Author

Wolfgang is the CEO of Confetti. Confetti is a think tank for advanced real-time graphics research and a service provider for the video game and movie industries. Before cofounding Confetti, Wolfgang worked as the lead graphics programmer in Rockstar's core technology group RAGE for more than four years. He is the founder and editor of the ShaderX and GPU Pro book series, a Microsoft MVP, the author of several books and articles on real-time rendering, and a regular contributor to websites and conferences worldwide. One of the books he edited, ShaderX4, won the Game Developer Front Line Award in 2006. Wolfgang serves on many advisory boards throughout the industry; one of them is Microsoft's Graphics Advisory Board for DirectX 12. He is an active contributor to several future standards that drive the game industry. You can find him on Twitter at @wolfgangengel. Confetti's website is www.conffx.com.

Acknowledgement

I would like to thank the reviewers of this article:

  • Andrew Lauritzen
  • Robin Green
  • Michal Valient
  • Dean Calver
  • Juul Joosten
  • Michal Drobot

References and Related Links

** Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

How Intugine Integrated the Nimble* Gesture Recognition Platform with Intel® RealSense™ Technology


Shwetha Doss, Senior Application Engineer, Intel Corporation

Harshit Shrivastava, Founder and CEO, Intugine Technologies

Abstract

Intel® RealSense™ technology helps developers enable a natural user interface (NUI) for their gesture recognition platforms. The gesture recognition platform seamlessly integrates with Intel RealSense technology for NUI across segments of applications on Microsoft Windows* platforms. The gesture recognition platform handles all interactions with the user and the Intel® RealSense™ SDK, ensuring that no code changes are required for individual applications.

This paper highlights how Intugine (http://www.intugine.com/) enabled its gesture recognition platforms for Intel® RealSense™ technology. It also discusses how the same methodology can be applied to other applications related to games and productivity applications.

Introduction

Intel® RealSense™ technology adds “human-like” senses to computing devices. Intel® is working with OEMs to create future computing devices that will be able to hear, see, and feel the environment, as well as understand human emotion and a human’s sensitivity to context. These devices will interact with humans in immersive, natural, and intuitive ways.

Intel® RealSense™ technology understands four important modes of communication: hands, the face, speech, and the environment around you. This multi-modal processing will enable the devices to behave more like humans.

The Intel® RealSense™ Camera

The Intel® RealSense™ camera uses depth-sensing technology so that computing devices see more like you do. To harness the possibilities of Intel® RealSense™ technology, developers need to use the Intel® RealSense™ SDK along with the Intel® RealSense™ camera. There are two camera options: the F200 and the R200. These Intel-developed depth cameras support full VGA depth resolution and full 1080p RGB resolution, and require USB 3.0. Both cameras support depth and IR processing at 640×480 resolution at 60 frames per second (FPS).

There are many OEM devices with integrated Intel® RealSense™ cameras available, including Ultrabooks*, tablets, notebooks, 2 in1s, and all-in-one form factors.

Gesture Recognition Platform

Figure 1. Intel® RealSense™ cameras.

The Intel® RealSense™ camera (F200)

Figure 2. The Intel® RealSense™ camera (F200).

The infrared (IR) laser projector on the Intel RealSense camera (F200) sends non-visible patterns (coded light) onto the object. The IR camera captures the reflected patterns. These patterns are processed by the ASIC, which assigns depth values to each pixel to create a depth video frame.

Applications see both depth and color video streams. The ASIC syncs the depth stream with the color stream (texture mapping) using a UVC time stamp and generates data flags for each depth value (valid, invalid, or motion detected). The range of the F200 camera is about 120 cm.

The Intel® RealSense™ camera (R200)

Figure 3. The Intel® RealSense™ camera (R200).

The R200 camera actually contains three cameras, providing RGB (color) and stereoscopic IR to produce depth. With the help of a laser projector, the camera does 3D scanning for scene perception and enhanced photography. The indoor range is approximately 0.5–3.5 meters, and the outdoor range is up to 10 meters.

Intel® RealSense™ SDK

The Intel® RealSense™ SDK includes a set of pattern detection and recognition algorithm implementations exposed through standardized interfaces. These algorithm implementations shift the application developer’s focus from coding algorithm details to innovating on the usage of these algorithms.

Intel® RealSense™ SDK Architecture

The SDK library architecture consists of several components. The essence of the SDK functionality lies in the I/O modules and the algorithm modules. The I/O modules retrieve input from an input device or send output to an output device.

The algorithm module includes various pattern detection and recognition algorithms related to face recognition, gesture recognition, and speech recognition.


Figure 4. The Intel® RealSense™ SDK architecture.


Figure 5. The Intel® RealSense™ SDK provides 78-point face landmarks.


Figure 6. The Intel® RealSense™ SDK provides skeletal tracking.

Intugine Nimble*

Intugine Nimble* is a high-accuracy, motion-sensing wearable device. The setup consists of a USB sensor and two wearable devices: a ring and a finger clip. The sensor tracks the movement of the rings in 3D space with sub-millimeter accuracy and low latency. The device is based on computer vision: the rings emit a specific pattern in a narrow wavelength band, and the sensor is filtered to see only that wavelength. The software algorithm running on the host device recognizes the emitted pattern and tracks each ring individually. The software generates the coordinates of the rings at a high frame rate of over 60 coordinates per second for each ring.


Figure 7. The Intugine Nimble* effectively replaces the mouse and keyboard.


Applications With Nimble

Some of the available applications that Nimble can control are games such as Fruit Ninja*, Angry Birds*, and Counter-Strike* and utility applications such as Microsoft PowerPoint* and media players. These available applications are currently controlled by mouse and keyboard inputs. To control them with Nimble, we need to generate the keyboard and mouse events programmatically.

The software module that takes care of the keyboard and mouse events is called the interaction layer. Nimble uses a proprietary software interaction layer to interact with existing games and applications. The interaction layer maps the user’s fingertip coordinates to the application/OS recognizable mouse and keyboard events.

Nimble with the Intel® RealSense™ SDK

The Intel® RealSense™ SDK can detect IR emissions at 860 nm. The patterned emission of the Nimble rings can be customized to a certain wavelength range. By replacing the emission source in the ring with an 860 nm emitter, the ring emits the same patterns in the 860 nm range. The Intel® RealSense™ SDK can sense these emissions, which can be captured as an image stream and then tracked using the SDK. By implementing the Nimble pattern recognition and tracking algorithms on top of the Intel® RealSense™ SDK, we get the coordinates of individual rings at 60 FPS.

The Intel® RealSense™ SDK’s design avoids most lens and curvature defects, which allows better-scaled motion tracking of the Nimble rings. The IR resolution of 640×480 generates refined spatial coordinate information. The Intel® RealSense™ SDK supports up to 300 FPS in the IR stream, which provides almost zero latency in Nimble’s tracking and an extremely responsive experience.

Nimble technology is designed to track only the emissions of rings and thus misses the details of skeletal tracking that might be required for a few applications.


Figure 8. The Intugine Nimble* along with Intel® RealSense™ technology.

Value proposition for Intel® RealSense™ Technology

Nimble along with Intel® RealSense™ technology can support a wide range of existing applications. Currently over 100 applications are working seamlessly without needing any source-code modifications. And potentially most of the Microsoft* Windows and Android* applications can work with this solution.

Currently the Intel® RealSense™ camera (F200) supports a range of 120 cm. With the addition of Nimble, this range can extend to over 15 feet.

Nimble allows sub-millimeter accurate finger tracking within a range of 3 feet and sub-centimeter accurate tracking within a range of 15 feet. This enables many high-accuracy games and applications to be used with better control.

Nimble along with Intel® RealSense™ technology reduces the application latency to less than 5 milliseconds.

Nimble along with Intel® RealSense™ technology can support multiple rings together; we have tested up to eight rings with Intel® RealSense™ technology.

Summary

Nimble’s interaction layer along with Intel® RealSense™ technology can help add gesture support to any application without any changes to the source code. Using this technology, applications on Windows* and Android* platforms can add gesture support with minimal effort.

For More Information

  1. Intel® RealSense™ technology: http://www.intel.in/content/www/in/en/architecture-and-technology/realsense-overview.html
  2. Intugine: http://www.intugine.com/
  3. https://software.intel.com/en-us/articles/realsense-r200-camera

Intel® C++ Composer XE 2013 SP1 for Windows*, Update 6


Intel® C++ Composer XE 2013 SP1 Update 6 includes the latest Intel C/C++ compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® C++ Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Integrated Performance Primitives (Intel® IPP) Version 8.1 Update 1, Intel® Threading Building Blocks (Intel® TBB) Version 4.2 Update 5, and Intel® Debugger Extension 7.5-1.0 for Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

New in this release:

Note: For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File: w_ccompxe_online_2013_sp1.6.241.exe
Online installer

File: w_ccompxe_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications

File:  w_ccompxe_redist_msi_2013_sp1.6.241.zip
Redistributable Libraries for 32-bit and 64-bit msi files

File:  get-ipp-8.1-crypto-library.htm
Cryptography Library

Intel® Visual Fortran Composer XE 2013 SP1 for Windows* with Microsoft Visual Studio 2010 Shell & Libraries*, Update 6


Intel® Visual Fortran Composer XE 2013 SP1 Update 6 includes the latest Intel Fortran compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® Visual Fortran Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Debugger Extension 7.5-1.0 for Intel® Many Integrated Core Architecture (Intel® MIC Architecture)

New in this release:

Note: For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File:  w_fcompxe_online_2013_SP1.6.241.exe
Online installer

File:  w_fcompxe_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, English version)

File:  w_fcompxe_all_jp_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, Japanese version)

File:  w_fcompxe_redist_msi_2013_sp1.6.241.zip 
Redistributable Libraries for 32-bit and 64-bit msi files

Intel® Visual Fortran Composer XE 2013 SP1 for Windows* with IMSL*, Update 6


Intel® Visual Fortran Composer XE 2013 SP1 Update 6 includes the latest Intel Fortran compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® Visual Fortran Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Debugger Extension 7.5-1.0 for Intel® Many Integrated Core Architecture (Intel® MIC Architecture), IMSL* Fortran Numerical Library Version 7.0.1

New in this release:

Note: For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File:  w_fcompxe_online_2013_sp1.6.241.exe
Online installer

File:  w_fcompxe_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, English version)

File:  w_fcompxe_all_jp_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, Japanese version)

File:  w_fcompxe_redist_msi_2013_sp1.6.241.zip 
Redistributable Libraries for 32-bit and 64-bit msi files

File:  w_fcompxe_imsl_2013_sp1.0.024.exe 
IMSL* Library for developing 32-bit and 64-bit applications

3D People Full-Body Scanning System With Intel® RealSense™ 3D Cameras and Intel® Edison: How We Did It


By Konstantin Popov of Cappasity

Cappasity has been developing 3D scanning technologies for two years. This year we are going to release a scanning software product for Ultrabook™ devices and tablets with Intel® RealSense™ cameras: Cappasity Easy 3D Scan*. Next year we plan to create hardware and software solutions to scan people and objects. 
 
As an Intel® Software Innovator, and with the help of the Intel team, we were invited to show the prototype of the people scanning system much earlier than planned. We had limited time for preparations, but we still decided to take on the challenge. In this article I'll explain how we created our demo for the Intel® Developer Forum 2015, held August 18–20 in San Francisco.

Cappasity instant 3D body scan

Our demo is based upon previously developed technology that combines multiple depth cameras and RGB cameras into a single scanning system (U.S. Patent Pending). The general concept is as follows: we calibrate the positions, angles, and optical properties of the cameras. This calibration allows us to merge the data for subsequent reconstruction of the 3D model. To capture the scene in 3D, we can place the cameras around the scene, rotate the camera system around the scene, or rotate the scene itself in front of the cameras.
 
We selected the Intel® RealSense™ camera because we believe that it's an optimum value-for-money solution for our B2B projects. At present we are developing two prototype systems using several Intel® RealSense™ cameras: a scanning box with several 3D cameras for instant scanning and a system for full-body people scanning.
 
We demonstrated both prototypes at IDF 2015. The people scanning prototype operated with great success for the three days of the conference, scanning many visitors who came to our booth.

A system for full-body people scanning

Now let's see how it works. We attached three Intel® RealSense™ cameras to a vertical bar so that the bottom camera is aimed at the feet and lower legs, the middle camera captures the legs and the body, and the top-most camera films the head and the shoulders.


Each camera is connected to a separate Intel® NUC computer, and all the computers are connected to the local area network.
 
Since the cameras are mounted onto a fixed bar, we used a rotating table to rotate the person being filmed. The table construction is quite basic: a PLEXIGLAS* pad, roller bearings, and a step motor. The table is connected to the PC via an Intel® Edison board; it receives commands through the USB port.



We also used a simple lighting system to steadily illuminate the front of the person being filmed. In the future, all these components will be built into a single box, but at present we were demonstrating an early prototype of the scanning system, so we had to assemble everything using commercially available components.

Cappasity fullbody scan

Our software operates based on the client-server architecture, but the server part can be run on almost any modern PC. That is, any computer that performs our calculations is a "server" in our system. We often use an ordinary Ultrabook® with Intel® HD Graphics as a server. The server sends the recording command to the Intel® NUC computers, gets the data from them, then analyzes and rebuilds the 3D model. 
 
Now, let's look at some particular aspects of the task we are trying to solve. The 3D rebuilding technology that we use in the Cappasity products is based upon our implementation of the Kinect* Fusion algorithm. But in this case our challenge was much more complex: we had only one month to create an algorithm to reconstruct the data from several sources. We called it "Multi-Fusion." In its present state the algorithm can merge the data from an unlimited number of sources into a single voxel volume. For scanning people three data sources were enough.
 
Calibration is the first stage. The Cappasity software allows the devices to be calibrated pairwise. Our studies from the year we spent in R&D came in pretty handy in preparation for IDF 2015. In just a couple of weeks we reworked the calibration procedure and implemented support for voxel volumes after Fusion. Previously the calibration process was more involved with processing the point cloud. The system needs to be calibrated just once, after the cameras are installed. Calibration takes no more than 5 minutes.
 
Then we had to come up with a data-processing approach, and after doing some research we chose post-processing. That is, first we record the data from all cameras, then we upload the data to the server via the network, and then we begin the reconstruction process. All cameras record color and depth streams. As a result, we have the complete data set for further processing. This is convenient considering that the post-processing algorithms are constantly improved; the ones we're using were written just a couple of days before IDF.
 
Compared to the Intel® RealSense™ camera (F200), the Intel® RealSense™ camera (long-range R200) performs better with black color and complex materials. We had few glitches in tracking. The most important thing, however, is that the cameras allow us to capture the images at the required range. We have optimized the Fusion reconstruction algorithm for OpenCL* to achieve good performance even on Intel® HD Graphics 5500 and later. To remove the noise we used Fusion plus additional data segmentation after a single mesh was composed.


High resolution texture mapping algorithm

In addition, we have refined the high-resolution texture mapping algorithm. We use the following approach: we capture the image at the full resolution of the color camera, and then we project the image onto the mesh. We are not using voxel color since it causes the texture quality to degrade. The projection method is quite complex to implement, but it allows us to use both built-in and external cameras as color sources. For example, the scanning box we are developing operates using DSLR cameras to get high-resolution textures, which is important for our e-commerce customers.
 
However, even the built-in Intel® RealSense™ cameras with RGB provide perfect colors. Here is a sample after mapping the textures:


We are developing a new algorithm to eradicate the texture shifting. We plan to have it ready by the release of our Easy 3D Scan software product. 
 
Our seemingly simple demo is based upon complex code that allows us to compete with expensive scanning systems in the USD 100K+ price range. The Intel® RealSense™ cameras are budget-friendly, which will help them revolutionize the B2B market.
 
Here are the advantages of our people scanning system:

  • It is an affordable solution, and it’s easy to set up and operate. Only the press of a button is needed.
  • Small size: the scanning system can be placed in retail areas, recreational centers, medical institutions, casinos, and so on.
  • The quality of the 3D models is suitable for 3D printing and for developing content for AR/VR applications.
  • The precision of the resulting 3D mesh is suitable for taking measurements.

 
We understand that the full potential of the Intel® RealSense™ cameras is yet to be uncovered. We are confident that at CES 2016 we'll be able to demonstrate significantly improved products.

Blend the Intel® RealSense™ Camera and the Intel® Edison Board with JavaScript*


Introduction

Smart devices can now connect to things we never before thought possible. This is being enabled by the Internet of Things (IoT), which allows these devices to collect and exchange data.

Intel has created Intel® RealSense™ technology, which includes the Intel® RealSense™ camera and the Intel® RealSense™ SDK. Using this technology, you can create applications that detect gesture and head movement, analyze facial data, perform background segmentation, read depth level, recognize and synthesize voice and more. Imagine that you are developing a super sensor that can detect many things. Combined with the versatile uses of the Intel® Edison kit and its output, you can build creative projects that are both useful and entertaining.

The Intel® RealSense™ SDK provides support for popular programming languages and frameworks such as C++, C#, Java*, JavaScript*, Processing, and Unity*. This means that developers can get started quickly using a programming environment they are familiar with.

Peter Ma’s article, Using an Intel® RealSense™ 3D Camera with the Intel® Edison Development Platform, presents two examples of applications using C#. The first uses the Intel® RealSense™ camera as input and the Intel® Edison board as output. The result is that if you spread your fingers in front of Intel® RealSense™ camera, it sends a signal to the Intel® Edison board to turn on the light.

In the second example, Ma reverses the flow, with the Intel® Edison board as input and the Intel® RealSense™ camera as output. The Intel® Edison board provides data that comes from a sensor to be processed and presents it to us through the Intel® RealSense™ camera as voice synthesis to provide more humanized data.

Ma’s project inspired me to build something similar, but using JavaScript* instead of C#. I used the Intel® RealSense™ SDK to read and send hand gesture data to a node.js server, which then sends the data to the Intel® Edison board to trigger a buzzer and LED that are connected to it.

About the Project

This project is written in JavaScript*. If you are interested in implementing only a basic gesture, the algorithm module is already in the Intel® RealSense™ SDK. It gives you everything you need.

Hardware

Requirements:

Intel® Edison board with the Arduino breakout board

The Intel® Edison board is a low-cost, general-purpose computer platform. It uses a 22nm dual-core Intel® Atom™ SoC running at 500 MHz. It supports 40 GPIOs and includes 1 GB LPDDR3 RAM, 4 GB eMMC storage, dual-band Wi-Fi, and Bluetooth, all in a small form factor.

The board runs the Linux* kernel and is compatible with Arduino, so it can run an Arduino implementation as a Linux* program.

 


Figure 1. Intel® Edison breakout board kit.

Grove Starter Kit Plus - Intel® XDK IoT Edition

Grove Starter Kit Plus - Intel® XDK IoT Edition is designed for the Intel® Galileo board Gen 2, but it is fully compatible with the Intel® Edison board via the breakout board kit.

The kit contains sensors, actuators, and shields, such as a touch sensor, light sensor, and sound sensor, and also contains an LCD display as shown in Figure 2. This kit is an affordable solution for developing an IoT project.

You can purchase the Grove Starter Kit Plus here: 


Figure 2. Grove* Starter Kit Plus - Intel® XDK IoT Edition

Intel® RealSense™ Camera

The Intel® RealSense™ camera is built for game interactions, entertainment, photography, and content creation with a system-integrated or a peripheral version. The camera’s minimum requirements are a USB 3.0 port, a 4th gen Intel Core processor, and 8 GB of hard drive space.

The camera (shown in Figure 3) features full 1080p color and a depth sensor, giving the PC a 3D visual and immersive experience.


Figure 3. Intel® RealSense™ camera

You can purchase the complete developer kit, which includes the camera, here.

GNU/Linux* server

A GNU/Linux* server is easy to set up. You can use an old computer or laptop, or you can put the server in the cloud. I used a cloud server running Ubuntu* Server. If you use a different Linux* distribution, just adapt the commands to your system.

Software

Before we start to develop the project, make sure you have the following software installed on your system. You can use the links to download the software.

Set Up the Intel® RealSense™ Camera

To set up the Intel® RealSense™ camera, connect the Intel® RealSense™ camera (F200) to a USB 3.0 port, and then install the driver for the camera connected to your computer. Navigate to the Intel® RealSense™ SDK location, and open the JavaScript* sample in your browser:

Install_Location\RSSDK\framework\JavaScript\FF_HandsViewer\FF_HandsViewer.html

After the file opens, the script checks which platform you have. While the script is checking your platform, click the link in your web browser to install the Intel® RealSense™ SDK WebApp Runtime.

When the installation is finished, restart your web browser, and then open the file again. You can check to see that the installation was a success by raising your hand in front of the camera. It should show your hand gesture data visualized on your web browser.

Gesture Set Up

The gesture data produced by the Intel® RealSense™ camera looks like the following:

{
  "timeStamp": 130840014702794340,
  "handId": 4,
  "state": 0,
  "frameNumber": 1986,
  "name": "spreadfingers"
}

This sends "name":"spreadfingers" to the server to be processed.

Next, we will write some JavaScript* code to stream gesture data from the Intel® RealSense™ camera to the Intel® Edison board through the node.js server.

Working with JavaScript*

Finally, we get to do some programming. I suggest that you first copy the whole folder somewhere else, because the default installation doesn’t allow the original folder to be modified.

Copy the FF_HandsViewer folder from this location and paste it somewhere else. The folder’s location is:

\install_Location\RSSDK\framework\JavaScript\FF_HandsViewer\

This also lets you create your own project folder and keep things organized.

Next, copy the realsense.js file from the location below and paste it inside the FF_HandsViewer folder:

Install_Location\RSSDK\framework\common\JavaScript

To make everything easier, let’s create one file named edisonconnect.js. This file will receive gesture data from the Intel® RealSense™ camera and send it to the node.js server. Remember to change the IP address in the socket variable so that it points to your node.js server’s IP address:

// var socket = io ('change this to IP node.js server');

var socket = io('http://192.168.1.9:1337');

function edisonconnect(data){
  console.log(data.name);
  socket.emit('realsense_signal',data);
}

Now for the most important step: modify sample.js so that when gesture data is created, it is intercepted and passed to edisonconnect.js. You don’t need to worry about CPU activity; this doesn’t consume much frame rate or RAM.

// retrieve the fired gestures
for (g = 0; g < data.firedGestureData.length; g++){
  $('#gestures_status').text('Gesture: ' + JSON.stringify(data.firedGestureData[g]));

  // add script start - passing gesture data to edisonconnect.js
	edisonconnect(data.firedGestureData[g]);
  // add script end
}

After the loop above is running and passing gesture data along, the code below finishes the main task of the JavaScript* program. You have to fix the realsense.js file path, and it is critical to link the socket.io and edisonconnect.js files:

<!DOCTYPE html>
<html>
<head>
  <title>Intel&reg; RealSense&trade; SDK JavaScript* Sample</title>
  <script src="https://aubahn.s3.amazonaws.com/autobahnjs/latest/autobahn.min.jgz"></script>
  <script src="https://promisejs.org/polyfills/promise-6.1.0.js"></script>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
  <script src="realsense.js"></script>
  <script src="sample.js"></script>
  <script src="three.js"></script>
  <!-- add script start -->
  <script src="https://cdn.socket.io/socket.io-1.3.5.js"></script>
  <script src="edisonconnect.js"></script>
  <!-- add script end -->
  <link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>

The code is taken from the SDK sample and has been reduced to keep it simple and easy to follow. Its job is to send gesture data to the server: at this point the Intel® RealSense™ SDK has recognized the gesture and is ready to send it to the server.

Set Up the Server

We will use a GNU/Linux*-based server. I use an Ubuntu* server as the OS, but you can use any GNU/Linux* distribution that you are familiar with. We will skip the server installation section, because related tutorials are readily found on the Internet.

Log in as a root user through SSH to configure the server.

Since the server has just been installed, we need to update the repository list and upgrade the server. To do this, I will use the common commands found on Ubuntu distributions, but you can use similar commands depending on the GNU/Linux* distribution that you are using.

# apt-get update && apt-get upgrade

Once the repository list is updated, the next step is to install node.js.

# apt-get install nodejs

We also need to install npm Package Manager.

# apt-get install npm

Finally, install socket.io express from npm Package Manager.

# npm install socket.io express

Remember to create file server.js and index.html.

# touch server.js index.html

Edit the server.js file using your favorite text editor, such as vim or nano:

# vim server.js

Write down this code:

var express   = require("express");
var app   	= express();
var port  	= 1337;

app.use(express.static(__dirname + '/'));
var io = require('socket.io').listen(app.listen(port));
console.log("Listening on port " + port);

io.on('connection', function(socket){'use strict';
  console.log('a user connected from ' + socket.request.connection.remoteAddress);

	// Check realsense signal
	socket.on('realsense_signal', function(data){
  	socket.broadcast.emit('realsense_signal',data);
  	console.log('Hand Signal: ' + data.name);
	});
  socket.on('disconnect',function(){
	console.log('user disconnected');
  });
});

var port = 1337; assigns the server to port 1337. The console.log() calls indicate that the server is listening and whether data from the JavaScript* client has been received. The main line is socket.broadcast.emit('realsense_signal', data); once data is received, it is rebroadcast to all listening clients.

The last thing we need to do is run the server.js file with node. If "Listening on port 1337" is displayed as shown below, you have been successful.
# node server.js

root@edison:~# node server.js
Listening on port 1337
events.js:85

Set up the Intel® Edison Board

The Intel® Edison SDK is easy to deploy. Refer to the following documentation:

Now it's time to put the code onto the Intel® Edison board. This code connects to the server and listens for any broadcast that comes from it, much like the connect-and-listen step above. If gesture data is received, the Intel® Edison board switches its digital pins on or off.

Open the Intel® XDK IoT Edition and create a new project from Templates, using the DigitalWrite template, as shown in the screenshot below.

Edit line 9 in package.json by adding the socket.io-client dependency, as shown below. Declaring the dependency installs the socket.io client on the Intel® Edison board if it is not already present.

"dependencies": {"socket.io-client":"latest" // add this script
}
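For reference, a complete minimal package.json might look like the following (the name, description, and version values are placeholders):

{
  "name": "digitalwrite-realsense",
  "description": "Toggle Intel Edison GPIO pins from RealSense gestures",
  "version": "0.0.1",
  "main": "main.js",
  "dependencies": {
    "socket.io-client": "latest"
  }
}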

Open the file named main.js. The code first connects to the server to make sure the server is ready and listening. It then checks whether the received gesture data is named "spreadfingers", which sets digital pins 2 and 8 to 1 (On); any other gesture sets them back to 0 (Off).
Change the server IP address to match your own server. If you want to use different pins, make sure you change the mraa.Gpio(selectedpins) arguments as well.

var mraa = require("mraa");

var pins2 = new mraa.Gpio(2);
pins2.dir(mraa.DIR_OUT);

var pins8 = new mraa.Gpio(8);
pins8.dir(mraa.DIR_OUT);

var socket = require('socket.io-client')('http://192.168.1.9:1337');

socket.on('connect', function(){
  console.log('i am connected');
});

socket.on('realsense_signal', function(data){
  console.log('Hand Signal: ' + data.name);
  if(data.name=='spreadfingers'){
    pins2.write(1);
    pins8.write(1);
  } else {
    pins2.write(0);
    pins8.write(0);
  }
});

socket.on('disconnect', function(){
  console.log('i am not connected');
});

Select Install/Build, and then select Run after making sure the Intel® Edison board is connected to your computer.

Now make sure the server is up and running, and the Intel® RealSense camera and Intel® Edison board are connected to the Internet.

Conclusion

Using Intel® RealSense™ technology, this project modified the JavaScript* framework sample script to send captured gesture data to the Node.js server. But this project is only a beginning for more to come.

The code is straightforward: the server broadcasts the gesture data to every listening socket client. The Intel® Edison board, with socket.io-client installed, listens for broadcasts from the server, and the gesture named spreadfingers toggles the digital pins between 1 and 0.

The possibilities are endless. The Intel® RealSense™ camera is lightweight and easy to carry and use, and the Intel® Edison board is a capable embedded PC. Blending the Intel® Edison board and the Intel® RealSense™ camera with JavaScript* makes it easy to package, code, and build an IoT device. You can create something great and useful.

About the Author

Aulia Faqih - Intel® Software Innovator

Intel® RealSense™ Technology Innovator based in Yogyakarta, Indonesia, currently lecturing at UIN Sunan Kalijaga Yogyakarta. Love playing with Galileo / Edison, Web and all geek things.


Enabling IPP on OpenCV (Windows* and Linux* Ubuntu*)


To set up the environment (Windows* systems):

  • Configuration of OpenCV 3.0.0 – Enabling Intel® Integrated Performance Primitives (Intel® IPP)
    • Download OpenCV 3.0.0 (http://www.opencv.org/) and CMake 3.2.3 (http://www.cmake.org/download/).
    • Extract OpenCV to a location of your choice, then install and run CMake.
    • Add OpenCV's location as the source location and choose the location where you want the build to be created.
    • To enable IPP you have two options: you can use 'ICV', a special IPP build for OpenCV that is free, or you can use the IPP from an Intel® software tool suite (Intel® System Studio or Intel® Parallel Studio) if you have one.
    • To go with ICV, simply turn WITH_IPP on. The ICV package downloads automatically and the CMake configuration picks it up.
    • To enable IPP from an Intel® software suite instead, you need to manually add an entry for IPP on top of setting WITH_IPP. Click 'Add Entry', name it 'IPPROOT', choose PATH as its type, and enter the location where your IPP is installed (an equivalent command-line invocation is shown after this list).
    • If the configuration completes without a problem, you are ready to go.
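If you prefer the command line over the CMake GUI on Windows, an equivalent configuration might look like this (the IPPROOT value is only an example path; point it at your own IPP installation):

cmake -D WITH_IPP=ON -D IPPROOT="C:/path/to/your/IPP" <path-to-OpenCV-source>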

 

To set up the environment (Linux* Ubuntu* systems):

  • Configuration of OpenCV 3.0.0 – Enabling IPP
    • Download OpenCV 3.0.0 (http://www.opencv.org/).
    • Extract OpenCV to a location of your choice.
    • Open a terminal and go to where you extracted OpenCV.
    • As in the Windows* case, you can go with either ICV or IPP.
    • For ICV, type 'cmake -D WITH_IPP=ON .'
    • Example configuration result for ICV
    • For IPP, type 'cmake -D WITH_IPP=ON -D IPPROOT=<Your IPP Location> .'
    • Example configuration result for IPP
    • If the configuration went through without a problem, proceed and type 'make -j4'.
    • When the build is done, type 'make install' to finally install the library (a quick way to verify the result is shown after this list).
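To confirm that the library you just built really has IPP enabled, one quick check (a small sketch, not an official verification procedure) is to print OpenCV's build information and look for the "Use IPP" line:

// ipp_check.cpp - print the OpenCV build information; look for the "Use IPP:" entry in the output.
// Build with something like: g++ ipp_check.cpp -o ipp_check `pkg-config --cflags --libs opencv`
#include <iostream>
#include <opencv2/core.hpp>

int main() {
    std::cout << cv::getBuildInformation() << std::endl;
    return 0;
}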

 

Using the Intel® RealSense™ Camera with TouchDesigner*: Part 1


Download Demo Files ZIP 35KB

TouchDesigner*, created by Derivative, is a popular platform/program used worldwide for interactivity and real-time animations during live performances as well as rendering 3D animation sequences, building mapping, installations and recently, VR work. The support of the Intel® RealSense™ camera in TouchDesigner makes it an even more versatile and powerful tool. Also useful is the ability to import objects and animations into TouchDesigner from other 3D packages using .fbx files, as well as taking in rendered animations and images.

In this two-part article I explain how the Intel RealSense camera is integrated into and can be used in TouchDesigner. The demos in Part 1 use the Intel RealSense camera TOP node. The demos in Part 2 use the CHOP node. In Part 2, I also explain how to create VR and full-dome sequences in combination with the Intel RealSense camera. I show how TouchDesigner’s Oculus Rift node can be used in conjunction with the Intel RealSense camera. Both Part 1 and 2 include animations and downloadable TouchDesigner files, .toe files, which can be used to follow along. To get the TouchDesigner (.toe) files click on the button on the top of the article. In addition, a free noncommercial copy of TouchDesigner which is fully functional (except that the highest resolution has been limited to 1280 by 1280), is available.

Note: There are currently two types of Intel RealSense cameras, the short range F200, and the longer-range R200. The R200 with its tiny size is useful for live performances and installations where a hidden camera is desirable. Unlike the larger F200 model, the R200 does not have finger/hand tracking and doesn’t support "Marker Tracking." TouchDesigner supports both the F200 and the R200 Intel RealSense cameras.

To quote from the TouchDesigner web page, "TouchDesigner is revolutionary software platform which enables artists and designers to connect with their media in an open and freeform environment. Perfect for interactive multimedia projects that use video, audio, 3D, controller inputs, internet and database data, DMX lighting, environmental sensors, or basically anything you can imagine, TouchDesigner offers a high performance playground for blending these elements in infinitely customizable ways."

I asked Malcolm Bechard, senior developer at Derivative, to comment on using the Intel RealSense camera with TouchDesigner:

"Using TouchDesigner’s procedural node-based architecture, Intel RealSense camera data can be immediately brought in, visualized, and then connected to other nodes without spending any time coding. Ideas can be quickly prototyped and developed with an instant-feedback loop.Being a native node in TouchDesigner means there is no need to shutdown/recompile an application for each iteration of development.The Intel RealSense camera augments TouchDesigner capabilities by giving the users a large array of pre-made modules such as gesture, hand tracking, face tracking and image (depth) data, with which they can build interactions. There is no need to infer things such as gestures by analyzing the lower-level hand data; it’s already done for the user."

Using the Intel® RealSense™ Camera in TouchDesigner

TouchDesigner is a node-based platform/program that uses Python* as its main scripting language. There are six distinct categories of nodes that perform different operations and functions: TOP nodes (textures), SOP nodes (geometry), CHOP nodes (animation/audio data), DAT nodes (tables and text), COMP nodes (3D geometry nodes and nodes for building 2D control panels), and MAT nodes (materials). The programmers at Derivative, consulting with Intel programmers, designed two special nodes, the Intel RealSense camera TOP node and the Intel RealSense camera CHOP node, to integrate the Intel RealSense camera into the program.
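Because every node is also addressable from Python, you can inspect or drive the Intel RealSense camera TOP from a script. A minimal sketch (assuming your network contains an Intel RealSense camera TOP named realsense1), typed into the Textport or an Execute DAT, might look like this:

# Reference the Intel RealSense camera TOP by its node name and print its current resolution.
rs = op('realsense1')
print(rs.width, rs.height)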

Note: This article is aimed at those familiar with using TouchDesigner and its interface. If you are unfamiliar with TouchDesigner and plan to follow along with this article step-by-step, I recommend that you first review some of the documentation and videos available here:

Learning TouchDesigner

Note: When using the Intel RealSense camera, it is important to pay attention to its range for best results. On this Intel web page you will find the range of each camera and best operating practices for using it.

Intel RealSense Camera TOP Node

The TOP nodes in TouchDesigner perform many of the same operations found in a traditional compositing program. The Intel RealSense camera TOP node adds to these capabilities by utilizing the 2D and 3D data that the Intel RealSense camera feeds into it. The Intel RealSense camera TOP node has a number of setup settings for acquiring different forms of data.

  • Color. The video from the Intel RealSense camera color sensor.
  • Depth. A calculation of the depth of each pixel. 0 means the pixel is 0 meters from the camera, and 1 means the pixel is the maximum distance or more from the camera.
  • Raw depth. Values taken directly from the Intel® RealSense™ SDK. Once again, 0 means the pixel is 0 meters from the camera, and 1 means the pixel is at the maximum range or farther from the camera.
  • Visualized depth. A gray-scale image from the Intel RealSense SDK that can help you visualize the depth. It cannot be used to actually determine a pixel’s exact distance from the camera.
  • Depth to color UV map. The UV values from a 32-bit floating RG texture (note, no blue) that are needed to remap the depth image to line up with the color image. You can use the Remap TOP node to align the images to match.
  • Color to depth UV map. The UV values from a 32-bit floating RG texture (note, no blue) that are needed to remap the color image to line up with the depth image. You can use the Remap TOP node to align the two.
  • Infrared. The raw video from the infrared sensor of the Intel RealSense camera.
  • Point cloud. Literally a cloud of points in 3D space (x, y, and z coordinates) or data points created by the scanner of the Intel RealSense camera.
  • Point cloud color UVs. Can be used to get each point’s color from the color image stream.

Note: You can download that toe file, RealSensePointCloudForArticle.toe, to use as a simple beginning template for creating a 3D animated geometry from the data of the Intel RealSense camera. This file can be modified and changed in many ways. Together, the three Intel RealSense camera TOP nodes—the Point Cloud, the Color, and the Point Cloud Color UVs—can create a 3D geometry composed of points (particles) with the color image mapped onto it. This creates many exciting possibilities.


Point Cloud Geometry. This is an animated geometry made using the Intel RealSense camera. This technique would be exciting to use in a live performance. The audio of the character speaking could be added as well. TouchDesigner can also use the data from audio to create real-time animations.

Intel RealSense Camera CHOP Node

Note: There is also an Intel RealSense camera CHOP node that controls the 3D tracking/position data that we will discuss in Part 2 of this article.

Demo 1: Using the Intel RealSense Camera TOP Node

Click on the button on top of the article to get the First TOP Demo: settingUpRealNode2b_FINAL.toe

Demo 1, part 1: You will learn how to set up the Intel RealSense camera TOP node and then connect it to other TOP nodes.

  1. Open the Add Operator/OP Create dialog window.
  2. Under the TOP section, click RealSense.
  3. On the Setup parameters page for the Intel RealSense camera TOP node, for Image select Color from the drop-down menu. In the Intel RealSense camera TOP node, the image of what the camera is pointing to shows up, just as in a video camera.
  4. Set the resolution of the Intel RealSense Camera to 1920 by 1080.
     


    The Intel RealSense camera TOP node is easy to set up.

  5. Create a Level TOP and connect it to the Intel RealSense camera TOP node.
  6. In the Pre parameters page of the Level TOP Node, choose Invert and slide the slider to 1.
  7. Connect The Level TOP node to an HSV To RGB TOP node and then connect that to a Null TOP node.


The Intel RealSense camera TOP node can be connected to other TOP nodes to create different looks and effects.

Next we will put this created image into the Phong MAT (Material) so we can texture geometries with it.

Using the Intel RealSense Camera Data to Create Textures for Geometries

Demo 1, part 2: This exercise shows you how to use the Intel RealSense camera TOP node to create textures and how to add them into a MAT node that can then be assigned to the geometry in your project.

  1. Add a Geometry (geo) COMP node into your scene.
  2. Add a Phong MAT node.
  3. Take the Null TOP node and drag it onto the Color Map parameter of your Phong MAT node.
     


    The Phong MAT using the Intel RealSense camera data for its Color Map parameter.

  4. On the Render parameter page of your Geo COMP, for the Material parameter, type phong1 to make it use the phong1 node as its material.
     


    The Phong MAT using the Intel RealSense camera data for its Color Map added into the Render/Material parameter of the Geo COMP node.

Creating the Box SOP and Texturing it with the Just Created Phong Shader

Demo 1, part 3: You will learn how to assign the Phong MAT shader you created using the Intel RealSense camera data to a box Geometry SOP.

  1. Go into the geo1 node to its child level, (/project1/geo1).
  2. Create a Box SOP node, a Texture SOP node, and a Material SOP node.
  3. Delete the Torus SOP node that was there and connect the box1 node to the texture1 node and the material1 node.
  4. In the Material parameter of the material1 node enter ../phong1, which refers to the phong1 MAT node you created at the parent level.
  5. To put the texture on each face of the box, in the parameters of the texture1 node set Texture/Texture Type to Face and set Texture/Offset to .5 .5 .5.
     


    At the child level of the geo1 COMP node, the Box SOP node, the Texture SOP node, and the Material SOP node are connected. The Material SOP is now getting its texture info from the phong1 MAT node, which is at the parent level (../phong1).

Animating and Instancing the Box Geometry

Demo 1, part 4: You will learn how to rotate a Geometry SOP using the Transform SOP node and a simple expression. Then you will learn how to instance the Box geometry. We will end up with a screen full of rotating boxes with the textures from the Intel RealSense camera TOP node on them.

  1. To animate the box rotating on the x-axis, insert a Transform SOP node after the Texture SOP node.
  2. Put an expression into the x component (first field) of the Rotate parameter in the transform1 SOP node. This expression is not dependent on the frames so it will keep going and not start repeating when the frames on the timeline run out. I multiplied by 10 to increase the speed: absTime.seconds*10
     


    Here you can see how the cube is rotating.

  3. To make the boxes, go up to the parent level (/project1) and in the Instance page parameters of the geo1 COMP node, for Instancing change it to On.
  4. Add a Grid SOP node and a SOP to DAT node.
  5. Set the grid parameters to 10 Rows and 10 Columns and the size to 20 and 20.
  6. In the SOP to DAT node parameters, for SOP put grid1 and make sure Extract is set to Points.
  7. In the Instance page parameters of the geo1 COMP, for Instance CHOP/DAT enter: sopto1.
  8. Fill in the TX, TY, and TZ parameters with P(0), P(1), and P(2) respectively to specify which columns from the sopto1 node to use for the instance positions.
     


    Click on the button on top of the article to download this .toe file to see what we have done so far in this first Intel RealSense camera TOP demo.

    TOP_Demo1_forArticle.toe

  9. If you prefer to see the image in the Intel RealSense camera unfiltered, disconnect or bypass the Level TOP node and the HSV to RGB TOP node.
     

Rendering or Performing the Animation Live

Demo 1, part 5: You will learn how to set up a scene to be rendered and either performed live or rendered out as a movie file.

  1. To render the project, add in a Camera COMP node, a Light COMP node, and a Render TOP node. By default the camera will render all the Geometry components in the scene.
  2. Translate your camera about 20 units back on the z-axis. Leave the light at the default setting.
  3. Set the resolution of the render to 1920 by 1080. By default the background of a render is transparent (alpha of 0).
  4. To make this an opaque black behind the squares, add in a Constant TOP node and change the Color to 0,0,0 so it is black while leaving the Alpha as 1. You can choose another color if you want.
  5. Add in an Over TOP node and connect the Render TOP node to the first hook up and the Constant TOP node to the second hook up. This makes the background pixels of the render (0, 0, 0, 1), which is no longer transparent.

Another way to change the alpha of a TOP to 1 is to use a Reorder TOP and set its Output Alpha parameter to Input 1 and One.


Shows the rendered scene with the background being set to opaque black.


Here you can see the screen full of the textured rotating cubes.

If you prefer to render out the animation instead of playing it in real time in a performance, choose the Export Movie Dialog under File in the top bar of the TouchDesigner program. In the parameter for TOP Video, enter null2 for this particular example; otherwise enter whichever TOP node you want to render.


Here is the Export Movie panel, and null2 has been pulled into it. If I had an audio CHOP to go along with it, I would pull or place that into the CHOP Audio slot directly under where I put null2.

Demo 1, part 6: One of the things that makes TouchDesigner a special platform is the ability to do real-time performance animations with it. This makes it especially good when paired with the Intel RealSense Camera.

  1. Add a Window COMP node and in the operator parameter enter your null2 TOP node.
  2. Set the resolution to 1920 by 1080.
  3. Choose the Monitor you want in the Location parameter. The Window COMP node lets you perform the entire animation in real time projected onto the monitor you choose. Using the Window COMP node you can specify the monitor or projector you want the performance to be played from.
     


    You can create as many Window COMP nodes as you need to direct the output to other monitors.

Demo 2: Using the Intel RealSense Camera TOP Node Depth Data

The Intel RealSense camera TOP node has a number of other settings that are useful for creating textures and animation.

In demo 2, we use the depth data to apply a blur on an image based on depth data from the camera. Click on the button on top of the article to get this file: RealSenseDepthBlur.toe

First, create an Intel RealSense camera TOP and set its Image parameter to Depth. The depth image has pixels that are 0 (black) if they are close to the camera and 1 (white) if they are far away from the camera. The range of the pixel values is controlled by the Max Depth parameter, which is specified in meters. By default it has a value of 5, which means pixels 5 or more meters from the camera will be white. A pixel with a value of 0.5 will be 2.5 meters from the camera. Depending on how far the camera is from you, changing this value to something smaller may be appropriate. For this example we've changed it to 1.5 meters.
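In other words, a normalized depth pixel converts to a distance by multiplying it by Max Depth. A tiny worked example of the arithmetic (plain Python, just for illustration):

max_depth = 1.5                        # the Max Depth parameter, in meters
pixel_value = 0.5                      # a normalized sample from the depth image
distance_m = pixel_value * max_depth   # 0.75 meters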

Next we want to process the depth a bit to remove objects outside our range of interest, which we will do using a Threshold TOP.

  1. Create a Threshold TOP and connect it to the realsense1 node. We want to cull out pixels that are beyond a certain distance from the camera, so set the Comparator parameter to Greater and set the Threshold parameter to 0.8. This makes pixels that are greater than 0.8 (which is 1.2 meters or farther, since Max Depth in the Intel RealSense camera TOP is set to 1.5) become 0, and all other pixels become 1.
     

  2. Create a Multiply TOP and connect the realsense1 node to the first input and the thresh1 node to the second input. Multiplying the pixels we want to keep by 1 leaves them as-is, and multiplying the others by 0 makes them black. The multiply1 node now has non-zero pixels only in the part of the image that will control the blur we apply next.
  3. Create a Movie File in TOP, and select a new image for its File parameter. In this example we select Metter2.jpg from the TouchDesigner Samples/Map directory.
  4. Create a Luma Blur TOP and connect moviefilein1 to the 1st input of lumablur1 and multiply1 to the 2nd input of lumablur1.
  5. In the parameters for lumablur1 set White Value to 0.4, Black Filter Width to 20, and White Filter Width to 1. This makes pixels where the first input is 0 have a blur filter width of 20, and pixels with a value of 0.4 or greater have a blur width of 1.
     


    The whole layout.

The result is an image where the pixels where the user is located are not blurred while other pixels are blurry.


Turning on the display of the Luma Blur TOP shows how the background of the image is blurred.

Demo 3: Using the Intel RealSense Camera TOP Node Depth Data with the Remap TOP Node

Click on the button on the article top to get this file: RealSenseRemap.toe

Note: The depth and color cameras of the Intel RealSense camera TOP node are in different spots in the world so their resulting images by default do not line up. For example if your hand is positioned in the middle of the color image, it won’t be in the middle of the depth image, it will either be off to the left or right a bit. The UV remap fixes this by shifting the pixels around so they align on top of each other. Notice the difference between the aligned and unaligned TOPs.


The Remap TOP aligns the depth data from the Intel RealSense camera TOP with the color data from the Intel RealSense camera TOP, using the depth to color UV data, putting them in the same world space.

Demo 4: Using Point Cloud in the Intel RealSense Camera TOP Node

Click on the button on top of the article to get this file: PointCloudLimitEx.toe

In this exercise you learn how to create animated geometry using the Intel RealSense camera TOP node Point Cloud setting and the Limit SOP node. Note that this technique is different from the Point Cloud example file shown at the beginning of this article. The previous example uses GLSL shaders, which makes it possible to generate far more points, but it is more complex and out of the scope of this article.

  1. Create a RealSense TOP node and set the parameter Image to Point Cloud.
  2. Create a TOP to CHOP node and connect it to a Select CHOP node.
  3. Connect the Select CHOP node to a Math CHOP node.
  4. In the topto1 CHOP node parameter, TOP, enter: realsense1.
  5. In the Select CHOP node parameters, Channel Names, enter r g b leaving a space between the letters.
  6. In the math1 CHOP node for the Multiply parameter, enter: 4.2.
  7. On the Range parameters page, for To Range, enter: 1 and 7.
  8. Create a Limit SOP node.

To quote from the information on the www.derivative.ca online wiki page, "The Limit SOP creates geometry from samples fed to it by CHOPs. It creates geometry at every point in the sample. Different types of geometry can be created using the Output Type parameter on the Channels Page."

  1. In the limit1 SOP Channels parameters page, enter r in the X Channel, g in the Y Channel, and b in the Z Channel.
     

    Note: Switching the r, g, and b to different X, Y, or Z channels changes the geometry being generated. So you might want to try this later: In the Output parameter page, for Output Type select Sphere at Each Point from the drop-down. Create a SOP to DAT node. In the parameters page, for SOP put in limit1 or drag your limit1 SOP into the parameter. Keep the default setting of Points in the Extract parameter. Create a Render TOP node, a Camera COMP node, and a Light COMP node. Create a Reorder TOP, set its Output Alpha to Input 1 and One, and connect it to the Render TOP.


    As the image in the Intel RealSense camera changes, so does the geometry. This is the final layout.


    Final images in the Over TOP node. By changing the order of the channels in the Limit SOP parameters you change the geometry, which is based on the point cloud.

In Part 2 of this article we will discuss the Intel RealSense camera CHOP and how to create content both rendered and in real-time for performances, Full Dome shows, and VR. We will also show how to use the Oculus Rift CHOP node.

About the Author

Audri Phillips is a visualist/3d animator based out of Los Angeles, with a wide range of experience that includes over 25 years working in the visual effects/entertainment industry in studios such as Sony, Rhythm and Hues, Digital Domain, Disney, and Dreamworks feature animation. Starting out as a painter she was quickly drawn to time based art. Always interested in using new tools she has been a pioneer of using computer animation/art in experimental film work including immersive performances. Now she has taken her talents into the creation of VR. Samsung recently curated her work into their new Gear Indie Milk VR channel.

Her latest immersive work/animations include: Multi Media Animations for "Implosion a Dance Festival" 2015 at the Los Angeles Theater Center, 3 Full dome Concerts in the Vortex Immersion dome, one with the well-known composer/musician Steve Roach. She has a fourth upcoming fulldome concert, "Relentless Universe", on November 7th, 2015. She also created animated content for the dome show for the TV series, “Constantine” shown at the 2014 Comic-Con convention. Several of her Fulldome pieces, “Migrations” and “Relentless Beauty”, have been juried into "Currents", The Santa Fe International New Media Festival, and Jena FullDome Festival in Germany. She exhibits in the Young Projects gallery in Los Angeles.

She writes online content and a blog for Intel. Audri is an Adjunct professor at Woodbury University, a founding member and leader of the Los Angeles Abstract Film Group, founder of the Hybrid Reality Studio (dedicated to creating VR content), a board member of the Iota Center, and she is also an exhibiting member of the LA Art Lab. In 2011 Audri became a resident artist of Vortex Immersion Media and the c3: CreateLAB.

Caffe* Training on Multi-node Distributed-memory Systems Based on Intel® Xeon® Processor E5 Family


Deep neural network (DNN) training is computationally intensive and can take days or weeks on modern computing platforms. In the recent article, Single-node Caffe Scoring and Training on Intel® Xeon® E5 Family, we demonstrated a tenfold performance increase of the Caffe* framework on the AlexNet* topology and reduced the training time to 5 days on a single node. Intel continues to deliver on the machine learning vision outlined in Pradeep Dubey’s Blog, and in this technical preview, we demonstrate how the training time for Caffe can be reduced from days to hours in a multi-node, distributed-memory environment.

Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC) and one of the most popular community frameworks for image recognition. Caffe is often used as a benchmark together with AlexNet*, a neural network topology for image recognition, and ImageNet*, a database of labeled images.

The Caffe framework does not support multi-node, distributed-memory systems by default and requires extensive changes to run on distributed-memory systems. We perform strong scaling of the synchronous minibatch stochastic gradient descent (SGD) algorithm with the help of Intel® MPI Library. Computation for one iteration is scaled across multiple nodes, such that the multi-threaded multi-node parallel implementation is equivalent to the single-node, single-threaded serial implementation.

We use three approaches—data parallelism, model parallelism, and hybrid parallelism—to scale computation. Model parallelism refers to partitioning the model, or weights, across nodes, such that parts of the weights are owned by a given node and each node processes all the data points in a minibatch. This requires communication of the activations and the gradients of activations, unlike data parallelism, which communicates weights and weight gradients. A schematic sketch of the data-parallel communication step follows.
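For intuition only, here is a minimal sketch (not taken from the technology preview package) of the core communication step in data parallelism: each MPI rank computes weight gradients on its share of the minibatch, and an allreduce sums them so that every rank applies the same weight update.

// data_parallel_sketch.cpp - illustrative only; not part of the package. Build with mpicxx, run with mpirun.
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Pretend these gradients were computed from this rank's slice of the minibatch.
    float grad[4] = {0.1f, 0.2f, 0.3f, 0.4f};

    // Sum the weight gradients across all ranks, in place; every rank ends up with identical values.
    MPI_Allreduce(MPI_IN_PLACE, grad, 4, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        std::cout << "summed grad[0] = " << grad[0] << std::endl;

    MPI_Finalize();
    return 0;
}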

With this additional level of distributed parallelization, we trained AlexNet on the full ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC-2012) dataset and reached 80% top-5 accuracy in just over 5 hours on a 64-node cluster of systems based on Intel® Xeon® processor E5 family.

 

Getting Started

While we are working to incorporate the new functionality outlined in this article into future versions of Intel® Math Kernel Library (Intel® MKL) and Intel® Data Analytics Acceleration Library (Intel® DAAL), you can use the technology preview package attached to this article to reproduce the demonstrated performance results and even train AlexNet on your own dataset. The preview includes both the single-node and the multi-node implementations. Note that the current implementation is limited to the AlexNet topology and may not work with other popular DNN topologies.

The package supports the AlexNet topology and introduces the ‘intel_alexnet’ and ‘mpi_intel_alexnet’ models, which are similar to ‘bvlc_alexnet’ with the addition of two new ‘IntelPack’ and ‘IntelUnpack’ layers, as well as the optimized convolution, pooling, normalization layers, and MPI-based implementations for all these layers. We also changed the validation parameters to facilitate vectorization by increasing the validation minibatch size from 50 to 256 and reducing the number of test iterations from 1,000 to 200, thus keeping constant the number of images used in the validation run. The package contains the ‘intel_alexnet’ model in these folders:

  • models/intel_alexnet/deploy.prototxt
  • models/intel_alexnet/solver.prototxt
  • models/intel_alexnet/train_val.prototxt.
  • models/mpi_intel_alexnet/deploy.prototxt
  • models/mpi_intel_alexnet/solver.prototxt
  • models/mpi_intel_alexnet/train_val.prototxt.
  • models/mpi_intel_alexnet/train_val_shared_db.prototxt
  • models/mpi_intel_alexnet/train_val_split_db.prototxt

Both the ’intel_alexnet’ and the ’mpi_intel_alexnet’ models allow you to train and test the ILSVRC-2012 training set.

To start working with the package, ensure that all the regular Caffe dependencies and Intel software tools listed in the System Requirements and Limitations section are installed on your system.

Running on Single Node

  1. Unpack the package.
  2. Specify the paths to the database, snapshot location, and image mean file in these ‘intel_alexnet’ model files:
    • models/intel_alexnet/deploy.prototxt
    • models/intel_alexnet/solver.prototxt
    • models/intel_alexnet/train_val.prototxt
  3. Set up a runtime environment for the software tools listed in the System Requirements and Limitations section.
  4. Add the path to ./build/lib/libcaffe.so to the LD_LIBRARY_PATH environment variable.
  5. Set the threading environment as follows:
    $> export OMP_NUM_THREADS=<N_processors * N_cores>
    $> export KMP_AFFINITY=compact,granularity=fine

Note: OMP_NUM_THREADS must be an even number equal to at least 2.
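For example, on a hypothetical two-socket system with 18 cores per socket (2 * 18 = 36, which is even and at least 2), you would set:

$> export OMP_NUM_THREADS=36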

  1. Run timing on a single node using this command:
    $> ./build/tools/caffe time \
           -iterations <number of iterations> \
           --model=models/intel_alexnet/train_val.prototxt
  2. Run training on a single node using this command:
    $> ./build/tools/caffe train \
           --solver=models/intel_alexnet/solver.prototxt

Running on Cluster

  1. Unpack the package.
  2. Set up a runtime environment for the software tools listed in the System Requirements and Limitations section.
  3. Add the path to ./build-mpi/lib/libcaffe.so to the LD_LIBRARY_PATH environment variable.
  4. Set the NP environment variable to the number of nodes to be used, as follows:

$> export NP=<number-of-mpi-ranks>

Note: the best performance is achieved with one MPI rank per node.

  1. Create a node file in the root directory of the application with the name of x${NP}.hosts. For instance, for IBM* Platform LSF*, run the following command:

$> cat $PBS_NODEFILE > x${NP}.hosts

  1. Specify the paths to the database, snapshot location, and image mean file in the following ‘mpi_intel_alexnet’ model files:
    • models/mpi_intel_alexnet/deploy.prototxt,
    • models/mpi_intel_alexnet/solver.prototxt,
    • models/mpi_intel_alexnet/train_val_shared_db.prototxt

Note: on some system configurations, performance of a shared-disk system may become a bottleneck. In this case, pre-distributing the image database to compute nodes is recommended to achieve best performance results. Refer to the readme files included with the package for instructions.

  1. Set the threading environment as follows:

$> export OMP_NUM_THREADS=<N_processors * N_cores>
$> export KMP_AFFINITY=compact,granularity=fine

Note: OMP_NUM_THREADS must be an even number equal to at least 2.

  1. Run timing using this command:
    $> mpirun -nodefile x${NP}.hosts -n $NP -ppn 1 -prepend-rank \
           ./build-mpi/tools/caffe time \
           -iterations <number of iterations> \
           --model=models/mpi_intel_alexnet/train_val.prototxt

  1. Run training using this command:
    $> mpirun -nodefile x${NP}.hosts -n $NP -ppn 1 -prepend-rank \
           ./build-mpi/tools/caffe train \
           --solver=models/mpi_intel_alexnet/solver.prototxt

System Requirements and Limitations

The package has the same software dependencies as non-optimized Caffe:

Intel software tools:

Hardware compatibility:

This software was validated with the AlexNet topology only and may not work with other configurations.

Support

Please direct questions and comments on this package to intel.mkl@intel.com.

Video Transcode Solutions: Simple, Fast, Efficient - Dec. 1 Webinar


In a world where internet video use is skyrocketing and consumers expect High Definition and ultra-high definition (UHD) 4K viewing anytime, anywhere and on any device, excel with Intel in delivering live and on-demand video faster, more efficiently, and at higher quality through the latest media acceleration technologies. Get the most performance from your media platform, and accelerate to 4K/UHD and HEVC, while reducing infrastructure and development costs.

Attend this free online webinar, Video Transcode Solutions on Dec. 1, 9 a.m. (Pacific), to learn about new media acceleration and Intel graphics technologies. Offer your cloud and communication service provider customers a customizable solution that can:

  • Deliver fast video transcoding into multiple formats and bit rates in less time
  • Reduce the amount of storage needed for multiple formats through higher compression processing and offering multiple rate control techniques
  • Allow for real-time transcoding into multiple formats from the stored format, reducing the need to store all possible media formats
  • Reduce the amount of network bandwidth needed (lower bit rates) at better video quality by compressing the video appropriately prior to transmission

Video Transcode Solutions: Simple, Fast, Efficient
Online Webinar | Dec. 1, 2015 - 9 a.m. (Pacific)

     

See how Intel can help you innovate and bring new media solutions quicker to market. Webinar is for cloud media solutions and video streaming/conferencing providers, media/graphics developers, broadcast/datacenter engineers, and IT/business decision-makers.

Speakers

Shantanu Gupta, Intel Media Accelerator Products Director
Shantanu has held leadership roles at Intel for 27 years, spanning technology/solutions marketing, integration, product design/development, and more.

 

Mark J. Buxton, Intel Media Development Products Director
Mark has more than 20 years of experience leading video standards development and Intel's media development product efforts, including products such as Intel® Media Server Studio, Intel® Video Pro Analyzer, and Intel® Stress Bitstreams and Encoder.

Platform Analyzer - Analyzing Healthy and not-so Healthy Applications


Recently my wife purchased a thick and expensive book. As an ultrasonic diagnostician for children, she purchases many books, but this one had me puzzled.  The book was titled Ultrasound Anatomy of the Healthy Child.  Why would she need a book that showed only healthy children?  I asked her and her answer was simple: to diagnose any disease, even one not yet discovered, you need to know what a healthy child looks like. 

In this article we will act like doctors, analyzing and comparing a healthy and a not-so-healthy application.

Knock – knock – knock.

The doctor says: “It’s open, please enter.”

In walks our patient,  Warrior Wave*, an awesome game in which your hand acts as the road for the warriors to cross. It’s extremely fun to play, innovative, and uses Intel® RealSense™ technology. 

While playing the game, though, something felt a little off.  Something that I hadn’t felt before in other games based on Intel® RealSense™ technology.  The problem could be caused by so many things, but what is it in this case?  

Like any good doctor who is equipped with the latest and greatest analysis tools to diagnose the problem, we have the perfect tools to analyze our patient.

Using Intel® Graphics Performance Analyzer (Intel® GPA) Platform Analyzer, we receive a time-line view of our application’s CPU load, frame time, frames per second (FPS), and draw calls:

Let’s take a look.

Hmm… the first things that catch our eye are the regular FPS surges that occur periodically. All is relatively smooth for ~200 milliseconds and then jumps up and down severely.

For comparison, let’s look at a healthy FPS trace bellow. The game in this trace felt smooth and played well.  

No pattern was evident within the frame time, just normal random deviations.

But in our case we see regular surges. These surges happen around four times a second.  Let’s investigate the problem deeper, by zooming in on one of the surges and seeing what happening in the threads:

We can see that working thread 2780 spends most of the time in synchronization. The thread does almost nothing but wait for the next frame from the Intel® RealSense™ SDK:

At the same time, we see that rendering goes in another worker thread. If we scroll down, we find thread 2372.

Instead of “actively” waiting for the next frame from the Intel RealSense SDK, the game could be doing valuable work. Drawing and Intel® RealSense™ SDK work could be done in one worker thread instead of two, simplifying thread communication.

Excessive inter-thread communication can drastically slow down the execution and cause many problems.

Here is the example of a “healthy” game, where the Intel® RealSense™ SDK work and the DirectX* calls are in one thread. 

RealSense™ experts say: there is no point in waiting for the frames from the Intel® RealSense™ SDK. They won’t be ready any faster. 

But we can see that the main problem is at the top of the timeline.

On average, five out of six CPU frames did not result in a GPU frame. This is the cause of the slow and uneven GPU frame rate, which on average is less than 16 FPS.

Now let’s look at the pipeline to try and understand how the code is executing.  Looking at the amount of packets on “Engine 0,” the pipeline is filled to the brim, but the execution is almost empty.

The brain can process 10 to 12 separate images per second, perceiving them individually. This explains why the first movies were cut at a rate of 16 FPS: this is the average threshold at which the majority of people stop seeing a slide show and start seeing a movie.

Once again, let’s see the profile of the nice-looking game: 

Notice that the GPU frames follow the CPU frames with a slight shift. For every CPU frame, there is a corresponding GPU frame that starts execution after a small delay.

Let’s try to understand why our game doesn’t have this pattern.

First, let's examine our DirectX* calls. The highlighted one with the tooltip is our "Present" call that sends the finished frame to the GPU. In the screenshot above, we see that it creates a "Present" packet on the GPU pipeline (marked with X's). At around the 2215-ms mark, it has moved closer to execution, jumping over three positions, but at 2231 ms it simply disappears without completing execution.

And if we look at each present call within the trace, not one call successfully makes it to execution.

Question: How does the game draw itself if all our DirectX* Present calls are ignored?! Good thing we have good tools so we can figure this out. Let’s take a look.

Can you see something curious inside the gray oval? We can see that this packet, not caused by any DirectX* call in our code, still makes it to execution, fast and out of order. Hey, wait a minute!!!

Let's look closely at our packet. 

And now to the packet that got executed. 

Wow! It came from an EXTERNAL thread. What could this mean? External threads are threads that don’t belong to the game.

Our own packets get ignored, but an external thread draws our game? What? Hey, this tool went nuts!

No, the image is quite right. The explanation is that on the Windows* system (starting with Windows Vista*), there is a program called Desktop Window Manager (DWM), which does the actual composition on the screen. Its packets are the ones we see executing at a fast rate with high priority.  And no, our packets aren’t lost—they are intercepted by DWM to create the final picture.

But why would DWM get involved in a full-screen game? After thinking a while, I realized that the answer is simple: I have a multi-monitor desktop configuration. Removing my second monitor from the display configuration made Warrior Wave* behave like other games: normal GPU FPS, no glitches, and no DWM packets.

The patient will live! What a relief!

But other games still worked well even with a multi-monitor configuration, right (says the evil voice in the back of my head)?

To dig deeper, we need another tool. Intel® GPA Platform Analyzer lets you see CPU and GPU execution over time, but it doesn't give you the lower-level details of each frame.

We would need to look more closely at the Direct3D* Device creation code. For this we could use Intel® GPA Frame Analyzer for DirectX*, but this is a topic for another article.

So let’s summarize what we have learned:

During this investigation we were able to detect poor usage of threads that led to FPS surges, and a nasty DWM problem that was easily fixed by removing the second monitor from the desktop configuration.

Conclusion: Intel® GPA Platform Analyzer is a must-have tool for initial investigation of the problem. Get familiar with it and add it to your toolbox.

About the Author:

Alexander Raud works in the Intel® Graphics Performance Analyzers team in Russia and previously worked on the VTune Amplifier. Alex has dual citizenship in Russia and the EU, speaks Russian, English, some French, and is learning Spanish.  Alex has a wife and two children and still manages to play Progressive Metal professionally and head the International Ministry at Jesus Embassy Church.

Intel® RealSense™ Cameras and DCMs Overview


Introduction

The Intel® RealSense™ Depth Camera Manager (DCM) is intended to expose interfaces for streaming video from the Intel® RealSense™ cameras F200 and R200, for both color and depth data streams. The camera service allows multiple Intel® RealSense™ SDK applications and a single non-SDK application to access data from the camera simultaneously, without blocking each other. Without the camera service, only one application at a time can access data from the camera, to ensure that the correct data is received.

DCM Functionality

The DCM is the primary interface between the Intel RealSense camera and SDK clients via the Intel RealSense SDK APIs. The DCM exposes and manipulates all extended 2D and 3D camera capabilities for the client system. It provides a compatible interface to standard video applications within the DCM environment via a virtual imaging device driver. It also manages camera control, access policy, and power management when multiple applications access the DCM. For these DCM functionalities to work properly, the DCM must be downloaded from Intel and installed on a platform that is equipped with an Intel RealSense camera. Visit https://downloadcenter.intel.com/download/25044 to download the F200 DCM and http://registrationcenter-download.intel.com/akdlm/irc_nas/7787/intel_rs_dcm_r200_2.0.3.39488.exe to download the R200 DCM for Windows* 8.1 and Windows 10. The functionality of the DCM applies to the different Intel RealSense camera models, such as the F200 and R200.

 

F200 Camera Model

 

The Intel RealSense camera F200 is the first generation of front-facing 3D cameras based on coded light depth technology. The camera implements an infrared (IR) laser projector system, VGA infrared (IR) camera, and a 2MP color camera with integrated ISP. This camera enables new platform usages by providing synchronized color, depth, and IR video stream data to the client system. The effective range of the depth solution from the camera is optimized from 0.2 to 1.0m for use indoors.

 

R200 Camera Model

 

The Intel RealSense camera R200 is the first generation of rear-facing 3D cameras based on active stereoscopic depth technology. The camera implements an IR laser projector, VGA stereoscopic IR cameras, and a 2MP color camera with integrated ISP. With synchronized color and infrared depth sensing features, the camera enables a wide variety of exciting new platform usages. The depth usage range of the camera depends upon the module and the lighting. The indoor range is up to 3 meters and the outdoor range is up to 10 meters.

Figure 1: DCM Model – High level view

 

Hardware Requirements

 

For details on system requirements and supported operating systems for F200 and R200, see https://software.intel.com/en-us/RealSense/Devkit/

 

DCM Components

 

There are two DCM components: DCM service and DCM virtual driver.

DCM Service

The DCM service runs on the client machine and controls requests from multiple applications to operate the managed cameras. The DCM service also dispatches multiple access requests from several applications accessing the same video stream. The DCM service runs at startup and allows multiple client applications to connect to it. The DCM service interfaces with the camera through the camera DLL and is the primary camera interface for all application types. The camera DLL is camera-model specific and extends hardware functionality for each camera. Below is an example of the Task Manager of a system that has the DCMs for the F200 and R200 installed.

Figure 2: The DCM Service runs at startup

 

DCM Virtual Driver

The DCM virtual driver is a virtual AVStream device driver that supports a compatible interface into the DCM for standard video applications. This virtual driver allows standard video applications to access the camera concurrently with multiple SDK applications.
 

Detecting the DCM Version

 

Go to the "Intel® RealSense™ SDK Gold" shortcut on the desktop, then to "Samples\Sample binaries" or the C:\Program Files (x86)\Intel\RSSDK\bin\win32 directory, and open sdk_info. The Camera tab shows the DCM service version and other information about the cameras that are installed on the platform. For testing and development purposes, multiple major versions of the DCM can be installed on a platform. During runtime, only one camera—whether the same model or a different one—can be connected to the platform at a time.

Figure 3: RealSense SDK information

 

Troubleshooting

 

If the Intel RealSense camera does not stream data correctly:

  • Make sure that the DCM service exists and is running, as shown in Figure 2.

  • Check control panel to make sure that the app installed the Intel RealSense SDK Runtime during installation.

  • Make sure that the camera is connected properly.

 

Switching Cameras between DCM Runtimes

 

An SDK client can support different camera models through their respective DCM runtimes. The SDK client must close any access to one camera model before switching to the next camera model. Multiple concurrent accesses from the SDK client to multiple camera models are not allowed. If an SDK client enables simultaneous access to multiple camera models, unknown behaviors are likely to occur.

 

Uninstallation

 

Before installing a new version of the DCM, uninstall any existing versions. When you launch the DCM installer on a system that has an existing DCM installed, an uninstaller menu prompts you with the following options:

  • Modify. Edit currently installed feature or feature settings.

  • Repair. Fix missing or corrupt files, shortcuts, or registry entries.

  • Remove. Remove the DCM from the system registries and directory structure.

     

Summary

 

The Intel RealSense Depth Camera Manager is the primary interface between the Intel RealSense camera and the Intel RealSense SDK clients. It communicates with the Intel RealSense camera through the camera DLL.

 

Helpful References

 

Here is a collection of useful references for the Intel RealSense DCM and SDK, including release notes and how to download and update the software.

 

 

About the Author

 

Nancy Le is a software engineer at Intel Corporation in the Software and Services Group working on Intel® Atom™ processor scale-enabling projects.

 

Notices

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

Intel, the Intel logo, and Intel RealSense are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© 2015 Intel Corporation.

Intel RealSense Technology + Unity 5 - Pro Tips


Jacob Pennock from Livid Interactive imparts some valuable advice on working with Intel® RealSense™ technology and Unity* 5 in this series of seven videos. See below for a description of each video.

Note: In some of the videos, Jacob references a video that describes the SDK’s emotion module. That video is no longer available because the emotion module has been deprecated, and starting in the Intel® RealSense™ SDK R5, it has been removed.

How to Set Up the Intel® RealSense™ SDK Toolkit in Unity* 5

Jacob describes how to import the Intel RealSense SDK Toolkit into Unity 5 and test your installation with a gesture-based drag-and-stretch code sample from the Intel RealSense SDK. Note: if you’ve installed version R4 or above of the Intel RealSense SDK, you can disregard the instructions to swap out 32-bit toolkit binaries and replace with 64-bit binaries, because starting with R4, the toolkit ships with 64-bit binaries built for Unity 5.

 

Optimizations for the Intel® RealSense™ SDK Toolkit - Part 1

In this two-part video on optimizations, Jacob highlights aspects of the Intel RealSense Toolkit where performance is not adequate for a production application. In Part 1 he describes how to significantly improve frame rate performance with a change to SenseToolkitMgr.Update(). He also shows how to fix a mirroring problem with one of the toolkit samples.

 

Optimizations for the Intel® RealSense™ SDK Toolkit- Part 2

In Part 2, Jacob demonstrates how to fix a performance problem with toolkit method DrawImages.Update(). In the video he creates a new, more memory-efficient method to replace an Intel private method that has a large memory footprint.

 

How to Use the Intel® RealSense™ SDK in Unity* 5 Directly (without the toolkit)

In this video Jacob details how to access the Intel RealSense SDK directly from inside your Unity 5 C# code. The Intel RealSense Toolkit can be useful for quick testing, but developers will get better results for most applications by building customized interactions with the Intel RealSense SDK. He goes over the basic steps required to create customized interactions.

 

Intel® RealSense™ SDK + Unity* 5 Pro Tip: Monitor the Frame Rate

Jacob has learned over the past 2 years of working with Intel RealSense technology that it is helpful to measure the frame rate at which RealSense is returning data. In this video he goes over one way to accomplish this.

 

Alternative Hand Tracking Modes with the Intel® RealSense™ SDK and Unity* 5

The Intel RealSense SDK includes a couple of skeletal hand tracking examples. In this video Jacob demonstrates another hand tracking technique using the Extremities mode and the segmentation image it produces. This is much faster than using skeletal hand tracking.

 

Intel® RealSense™ SDK + Unity* 5 Pro Tip: Displaying Segmentation Images

Here Jacob takes the low-resolution hand segmentation images he created in the previous video and scales them up in size to something that is presentable to the user without looking pixelated.

 

Intel® RealSense™ SDK + Unity* 5 Pro Tip: Update in a Separate Thread

Jacob describes his new Asus laptop with an embedded Intel® RealSense™ camera, and demonstrates how to improve performance of your application by using a separate thread for most Intel RealSense SDK interactions.

 

About the Author

Jacob Pennock is an Intel® RealSense™ expert from the Bay Area. He is Senior Creative Developer, Helios Interactive Technologies & Chief Game Designer / Lead Developer, Livid Interactive.


Intel® Hardware-based Security Technologies Bring Differentiation to Biometrics Recognition Applications Part 1


Download PDF [PDF 1 MB]

Contents

Why Biometric Recognition is Better
How Biometric Recognition Works
The Attack Model
How Intel® Hardware-based Security Technologies Improve the Security of Biometrics Recognition

  1. Trusted Execution Environment with Intel® Software Guard Extensions
  2. Memory Protection Scheme with Virtual Machine Extensions
  3. Multiple Factor Authentication with Intel® Identity Protection Technology with One-Time Password

Link to Part 2
References

Why Biometric Recognition is Better

The security model of “Username/Password” has been used as a user’s identity certificate for years. When people need to prove that they are authorized users of a service (the usual process is to log in to a computer or an online service, such as social media, or online banking), they input their username and password. The disadvantages of this security model are obvious for a number of reasons, including:

  1. Simple passwords such as “123456” or “hello” can be cracked by a brute-force attack or a dictionary attack.
  2. A complicated password is hard to remember.
  3. Many people might use the same password across multiple sites.
  4. If a person forgets the password, after providing some other identification, he or she can reset it.

Figure 1. Password login scheme.

To improve the strength of passwords and the user experience, more and more service providers are beginning to use biometric identification technology as the password. With this technology, people don’t need to remember their passwords. Instead their voice, face, fingerprint, or iris is used as the identifying factor. Biometric identification factors are somewhat different from the traditional username/password security model factors:

  • Biometrics can be used to derive a long and complicated password, which offers greater security to withstand a brute-force attack.
  • Biometrics require more careful protection from the application developer, because biological information is part of the human body and cannot be changed easily. If biometric information is stolen, it is hard for a user to revoke his or her biometric password, and an attacker can use the stolen biometrics to fabricate a fake body part and pass the biometric check on the user’s other registered accounts in the future.
  • Some biological characteristics, such as face and voice, have a high false acceptance rate, so a biometric recognition system usually uses multi-factor biometric authentication to improve recognition accuracy.
  • Some biometric characteristics can be duplicated: a recorded voice, a printed photo of a face, or a gelatin finger cast from a fingerprint. It is important to add a liveness (vitality) detection module to the biometric recognition system to determine whether the biometric information comes from a live person or from a replica.

How Biometric Recognition Works

The basic flow of a biometric recognition application has five steps:

  1. Biometric information is collected by the sensor, which is connected through the I/O port.
  2. The output data format and speed are controlled by the specific device driver. Data is processed by the driver to meet the OS requirements at Ring-0 and then sent to the biometric verification app, which runs at Ring-3.
  3. Once the app gets the data, it does some preprocessing work and extracts the feature points from the data.
  4. Next, the extracted feature points are sent to the pattern matcher and compared with registered biometric patterns in the database.
  5. Once the pattern matches one of the registered patterns, the matcher sends the MATCH message, and the UI procedure displays that the user is logged in correctly and shows the corresponding premium content to the user.

Figure 2. The flow chart of a biometric recognition program.
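To make the flow concrete, the minimal sketch below models the five steps as a simple loop. Every type and function name here (CaptureFrame, ExtractFeatures, MatchPattern, and so on) is a hypothetical placeholder, stubbed out so the file compiles; a real application would substitute its sensor driver API and matching engine.

```cpp
// Conceptual sketch of the five-step flow above. All names are hypothetical
// placeholders (stubbed so the file compiles), not part of any Intel SDK.
#include <cstdint>
#include <iostream>
#include <optional>
#include <string>
#include <vector>

struct RawSample  { std::vector<std::uint8_t> bytes; };  // steps 1-2: data delivered by the driver
struct FeatureSet { std::vector<float> points; };        // step 3: extracted feature points

RawSample CaptureFrame()                     { return {}; }              // stub for the sensor/driver path
FeatureSet ExtractFeatures(const RawSample&) { return {}; }              // stub for preprocessing/extraction
std::optional<std::string> MatchPattern(const FeatureSet&) { return std::nullopt; }  // stub matcher

int main() {
    constexpr int kMaxAttempts = 5;
    for (int attempt = 0; attempt < kMaxAttempts; ++attempt) {
        RawSample sample    = CaptureFrame();           // 1-2. acquire sensor data via the device driver
        FeatureSet features = ExtractFeatures(sample);  // 3.   preprocess and extract feature points
        if (auto user = MatchPattern(features)) {       // 4.   compare against registered patterns
            std::cout << "MATCH: " << *user << " logged in\n";  // 5. show the user's content
            return 0;
        }
    }
    std::cout << "No match after " << kMaxAttempts << " attempts\n";
    return 1;
}
```

The point of the sketch is simply that the raw sample, the feature set, and the registered templates all pass through ordinary process memory, which is what the attack model below targets.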

The Attack Model

In a biometric-based authentication system, the most valuable data to an attacker is the user’s biometric pattern. This pattern could be the raw data from a sensor, the extracted feature point set in memory, or the registered biological pattern stored in the database.

In general, if the biometric recognition application is designed without proper security protection, an attacker could retrieve the raw data or feature point set from memory via a runtime attack using a rootkit or malware. The attacker could also launch an offline attack to get the registered biological pattern if the registration template is stored in the local storage of the device. Moreover, the attacker could sniff the data stream on the bus between the processor and the sensor, or use a camera or microphone near the user to capture biometric data such as face pictures or voice samples for a later replay attack.

Figure 3. Possible attacks on a biometrics recognition application.

From the perspective of a biometrics recognition service developer, the design philosophy of the application should provide end-to-end protection to keep a user’s privacy safe. This includes:

  • Provide a trusted running environment to preserve the integrity of the application’s code segment.
  • Protect the memory region that contains the biometric pattern from access by other applications.
  • Keep sensitive data strongly encrypted in memory and in local storage, and when exchanging secrets with other applications or a network server.

How Intel® Hardware-based Security Technologies Improve the Security of Biometrics Recognition

Intel’s platform offers various hardware-based security technologies to satisfy the security requirements for biometric verification applications.

1. Trusted Execution Environment with Intel® Software Guard Extensions

Biometric recognition technology is being used more and more widely because of its security. Because the technology is based on each individual’s unique characteristics (face, voice, fingerprint, iris), a person’s identity is hard to steal. Biometric recognition technology takes the place of traditional password authentication and offers a good user experience.

However, with the wide use of biometric recognition technology in various consumer devices, the diversity and openness of these platforms raise some potential security threats. One threat that developers need to consider carefully is how to secure the operation of a biometric identification function on a variety of terminal devices. In particular, they need to consider:

  • How to securely run the biometric sampling/modeling/matching algorithm on the terminal device
  • How to securely store the biometric data template on the terminal device
  • How to establish a secure channel between the terminal device and the cloud database of biological characteristics, to complete cloud authentication and other operations

Developers can rely on Trusted Execution Environment (TEE) technology to build an effective hardened solution.

What is TEE?

A TEE is a trusted execution environment that is isolated from the Rich Execution Environment (REE).

According to the Global Platform TEE System Architecture specification1, at the highest level, a TEE is an environment where the following are true:

  • Any code executing inside the TEE is trusted in authenticity and integrity.
  • Assets other than code are also protected in confidentiality.
  • The TEE resists all known remote and software attacks, and a set of external hardware attacks.
  • Both assets and code are protected from unauthorized tracing and control through debug and test features.

Intel® Software Guard Extensions Technology Overview

Intel® Software Guard Extensions (Intel® SGX) enables software developers to develop and deploy secure applications on open PC platforms. It is a set of new instructions and memory access changes added to the Intel® architecture.

Intel® SGX operates by allocating hardware-protected memory where code and data reside. The protected memory area is called an enclave. Data within the enclave memory can only be accessed by the code that also resides within the enclave memory space. Enclave code can be invoked via special instructions. An enclave can be built and loaded as a Windows* DLL.

Figure 4. Protected execution environment embedded in a process.

An Intel® SGX technology-enabled application is built as an untrusted part and a trusted part, following the Intel® SGX design framework2. When the application is running, it calls Intel® SGX special instructions to create an enclave, which is placed in trusted memory. When the trusted function is called, the code runs inside the enclave, and the relevant data can be seen in clear text only inside the enclave. Any external access to this data is denied. After the trusted function returns, the enclave data remains in trusted memory.

Figure 5. Intel® Software Guard Extensions technology-enabled application execution flow.
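The untrusted-side flow can be sketched in a few lines. The sketch below assumes the Intel® SGX SDK (sgx_urts.h, sgx_create_enclave, sgx_destroy_enclave are SDK APIs); the enclave image name "Enclave.signed.dll", the generated header "Enclave_u.h", and the ECALL proxy ecall_match_sample() are hypothetical names that would come from this application’s own EDL file, not from Intel.

```cpp
// Minimal untrusted-side sketch, assuming the Intel SGX SDK for Windows.
// Enclave.signed.dll, Enclave_u.h, and ecall_match_sample() are hypothetical
// names defined by the application's own EDL; they are not Intel APIs.
#include <cstdio>
#include "sgx_urts.h"
#include "Enclave_u.h"   // proxy functions generated by sgx_edger8r from the (hypothetical) EDL

int main() {
    sgx_enclave_id_t eid = 0;
    sgx_launch_token_t token = {0};
    int token_updated = 0;

    // Load the signed enclave image into protected memory (debug mode = 1 here).
    sgx_status_t ret = sgx_create_enclave("Enclave.signed.dll", 1,
                                          &token, &token_updated, &eid, nullptr);
    if (ret != SGX_SUCCESS) {
        std::printf("sgx_create_enclave failed: 0x%x\n", ret);
        return 1;
    }

    // Call into the enclave. The matching runs on plaintext features only
    // inside the enclave; the untrusted side sees just the yes/no result.
    int match_result = 0;
    ret = ecall_match_sample(eid, &match_result);   // hypothetical ECALL proxy
    if (ret == SGX_SUCCESS && match_result == 1) {
        std::printf("Biometric pattern matched inside the enclave\n");
    }

    sgx_destroy_enclave(eid);
    return 0;
}
```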

The objective of this technology is to enable a high level of protection for secrets. With Intel® SGX, an application gains the ability to defend its own secrets: sensitive data is protected within the application, and the attack surface, or Trusted Computing Base (TCB), is reduced to the application itself and the processor. Even malware that subverts the OS/VMM, BIOS, or drivers cannot steal the application’s secrets.

Figure 6. Reduced attack surface with Intel® Software Guard Extensions.

How to Harden the Biometric Recognition Function with Intel® Software Guard Extensions Technology

Before we discuss the security solution proposal for biometric recognition, we should address which factors should be protected during the process:

  • The user’s private biometric characteristics data should be handled carefully in the application, at rest, and in flight.
  • The biometric operation algorithm, including sampling, modeling, and matching, should be protected against viruses and malware. The output result data should not be tampered with.

We propose the architecture shown in Figure 7.

Figure 7. Hardened biometric recognition function by Intel® Software Guard Extensions.

The biometric sampling/modeling/matching algorithm is hosted inside the Intel® SGX enclave, the trusted part of the client, and is responsible for operating on the biometric characteristics data; its runtime confidentiality and integrity are guaranteed. This type of algorithm is normally implemented in software, and a plain software implementation may be tampered with at runtime by viruses and malware. In this architecture, however, the protected portion is loaded into an enclave where its code and data are measured. Once the application’s code and data are loaded into an enclave, they are protected against all external software access, so the biometric operation algorithm can be trusted. Beyond these security properties, the enclave environment offers the scalability and performance associated with execution on the main CPU of an open platform, which helps in performance-sensitive scenarios such as biometric recognition.

Intel® SGX technology provides a function to encrypt and integrity-protect enclave secrets so they can be stored outside the enclave, such as on disk, and reused by the application later. Data can be sealed against an enclave using a hardware-derived Seal Key, which is unique to the CPU and the specific enclave environment. Combined with other services provided by the Intel® SGX Platform Software, such as the Monotonic Counter and Trusted Time, the solution can protect against various attack techniques: the Monotonic Counter can implement a replay-protection policy, and Trusted Time can enforce a time-based policy, both applied to sealed data. The enclave is responsible for performing the encryption with an algorithm of its choice; in other words, the developer can choose any encryption framework according to the system’s security requirements. In this way the user’s private biometric characteristics data is handled only within the enclave, and its raw form is never exposed to the untrusted part outside the enclave.
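A minimal sketch of the enclave-side sealing step is shown below, assuming the SDK’s sealing API in sgx_tseal.h (sgx_calc_sealed_data_size and sgx_seal_data are SDK functions). The ECALL name and the way the sealed blob is handed back to the untrusted side for storage are illustrative assumptions, not part of the architecture described above.

```cpp
// Trusted (in-enclave) sketch: seal a registered template so it can be stored
// outside the enclave. Assumes the Intel SGX SDK sealing API (sgx_tseal.h);
// the ECALL name and output-buffer convention are hypothetical.
#include <cstdint>
#include "sgx_tseal.h"

// Hypothetical ECALL: seal `tmpl` (the registered biometric template) into the
// caller-provided output buffer. Returns the sealed size, or 0 on error.
uint32_t ecall_seal_template(const uint8_t* tmpl, uint32_t tmpl_len,
                             uint8_t* sealed_out, uint32_t sealed_out_cap) {
    const uint32_t sealed_size = sgx_calc_sealed_data_size(0, tmpl_len);
    if (sealed_size == UINT32_MAX || sealed_size > sealed_out_cap)
        return 0;

    // The Seal Key is derived by hardware for this CPU + enclave identity, so
    // the resulting blob can only be unsealed by the same enclave on the same CPU.
    sgx_status_t ret = sgx_seal_data(0, nullptr,     // no additional MAC text
                                     tmpl_len, tmpl, // plaintext template to protect
                                     sealed_size,
                                     reinterpret_cast<sgx_sealed_data_t*>(sealed_out));
    return (ret == SGX_SUCCESS) ? sealed_size : 0;
}
```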

Sometimes the client biometric recognition function needs to connect to the remote back-end server to do authentication in the cloud database instead of locally. Using Intel® SGX attestation capabilities, the client authentication module authenticates the client platform and user’s biometric characteristics data with the remote server. Attestation is the process of demonstrating that a piece of software has been properly instantiated on the platform. In Intel® SGX it is the mechanism by which another party can gain confidence that the correct software is securely running within an enclave on an enabled platform.

First, this module generates a verifiable report of the client’s identity that is bound to the platform by the CPU3. The report also includes information about the user running the biometric recognition session. The server verifies the report to ensure that it is communicating with a device that is enabled with Intel® SGX. The client and server engage in a one-time provisioning protocol that results in application secrets being securely sealed to the client platform, using Intel® SGX sealing capabilities.

These secrets, which can only be unsealed by the application that sealed them, are used to establish secure sessions with the server in the future, without the need to constantly prove the identity of the client platform. Such secrets can include a salt, an encryption key, a policy, or a certificate. After that, the biometric characteristics data and the authentication result can be sent through the secure communication channel between the client and the server.

2. Memory Protection Scheme with Virtual Machine Extensions

Dynamic data attack is one of the most common attack methodologies. Rootkits and malware can use this technique to hook a specified function and dump or modify data in memory at runtime. In the case of biometric recognition, malicious code can obtain both the biometric data captured from the sensor and the registered user’s biometric template from memory.

The Weakness of the Legacy Software-Based Memory Protection

Traditional software-based memory protection mechanisms are not reliable enough. The protection code and the malicious code run at the same privilege level (Ring-0 or Ring-3), so malware can easily compromise the protection code and disable the protection.

Figure 8. Attacks can compromise the protection module and access the sensitive data buffer.

Memory Protection Based on Virtual Machine Extensions

Virtual Machine Extensions (VMX) is a set of instructions that support virtualization of processor hardware4. Its basic working logic is:

  • Let basic CPU operations such as load/store, branch, and ALU operations execute directly
  • Monitor (trap) privileged instructions such as MMU manipulation, I/O instructions, or TLB updates
  • When a privileged instruction is executed, break the execution and switch the CPU into VMX root mode for further processing

The following diagram shows the relationship between the hardware, OS, and application with VMM mode enabled and disabled.

Figure 9. Different response to the system call when Virtual Machine Extensions mode is on/off.

By utilizing the hardware-based trap function of VMX, a hardware virtualization-based memory protection mechanism can protect memory in a safer and faster way5. The basic idea is to insert a VMM-based memory monitor module between the OS and the hardware. When the application is loaded, a memory map table is built for the trusted code and data regions. After the table is built, whenever memory is accessed the VMM can trap the access and compare the instruction address (EIP) and the target memory address against the pre-built table. The memory protection module can then decide whether the access is legal or illegal and apply the corresponding handling.
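The policy decision the monitor makes on each trapped access can be modeled in a few lines. The sketch below is only a user-mode model of that table lookup (which code region may touch which data region); it is not VMM or VMX-root code, and all addresses are invented for illustration.

```cpp
// User-mode model of the check a VMM-based monitor performs on each trapped
// access: is this instruction (EIP) allowed to touch this address? The regions
// below are invented; a real monitor runs in VMX root mode and builds the
// table when the protected application is loaded.
#include <cstdint>
#include <iostream>
#include <vector>

struct Region { std::uintptr_t begin, end; };          // half-open [begin, end)

struct PolicyEntry {
    Region trusted_code;    // code allowed to access...
    Region protected_data;  // ...this sensitive data region
};

bool In(const Region& r, std::uintptr_t a) { return a >= r.begin && a < r.end; }

// Legal if the access does not touch protected data, or if it touches protected
// data from an instruction inside the matching trusted code region.
bool IsLegalAccess(const std::vector<PolicyEntry>& table,
                   std::uintptr_t eip, std::uintptr_t target) {
    for (const auto& e : table) {
        if (In(e.protected_data, target))
            return In(e.trusted_code, eip);
    }
    return true;
}

int main() {
    std::vector<PolicyEntry> table = {
        {{0x401000, 0x402000},     // trusted matcher code (invented addresses)
         {0x800000, 0x801000}}     // biometric template buffer (invented addresses)
    };
    std::cout << IsLegalAccess(table, 0x401abc, 0x800010) << '\n';  // 1: trusted code reads the template
    std::cout << IsLegalAccess(table, 0x700123, 0x800010) << '\n';  // 0: foreign code is denied
}
```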

3. Multiple Factor Authentication with Intel® Identity Protection Technology with One-Time Password

Identity theft is a growing global concern for individuals and businesses. Secure but simple-to-use solutions are required, because hackers never stop devising new ways to steal usernames and passwords. For consumers and everyday computer users, Intel® Identity Protection Technology (Intel® IPT) provides strong protection against identity theft by letting you link your physical device to each Intel® IPT-enabled online account that you use.

Traditionally, two-factor authentication uses a one-time password (OTP) which combines something the user knows (a username and password) and something the user has (typically, a token or key fob that produces a six-digit number, valid only for a short period of time and available on demand).

In the case of Intel® IPT with OTP6, a unique, one-time use, six-digit number is generated every 30 seconds from an embedded processor that is tamper-proof and operates in isolation from the OS. Because the credential is protected inside the chipset, it cannot be compromised by malware or removed from the device.

Figure 10. Intel® Identity Protection Technology with one-time password authentication working flow between client and server.
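Intel® IPT generates and protects the credential inside the chipset, so its internals are not exposed to software. The sketch below only illustrates the standard time-based OTP construction (RFC 6238 / RFC 4226) that this class of 30-second, six-digit tokens follows, using OpenSSL’s HMAC for the keyed hash; the shared secret is an invented example value.

```cpp
// Illustration of the standard time-based OTP construction (RFC 6238 / 4226).
// This is NOT Intel IPT code; IPT derives and protects its credential in the
// chipset. Build with: g++ totp.cpp -lcrypto
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <openssl/evp.h>
#include <openssl/hmac.h>

// Six-digit TOTP for a 30-second time step.
std::uint32_t Totp(const unsigned char* key, int key_len, std::time_t now) {
    std::uint64_t step = static_cast<std::uint64_t>(now) / 30;

    unsigned char msg[8];                        // time step as 8-byte big-endian
    for (int i = 7; i >= 0; --i) { msg[i] = step & 0xff; step >>= 8; }

    unsigned char mac[EVP_MAX_MD_SIZE];
    unsigned int mac_len = 0;
    HMAC(EVP_sha1(), key, key_len, msg, sizeof(msg), mac, &mac_len);

    // Dynamic truncation (RFC 4226): take 31 bits at an offset chosen by the MAC.
    int off = mac[mac_len - 1] & 0x0f;
    std::uint32_t code = ((mac[off] & 0x7f) << 24) | (mac[off + 1] << 16) |
                         (mac[off + 2] << 8)       |  mac[off + 3];
    return code % 1000000;                       // six digits
}

int main() {
    const unsigned char key[] = "example-shared-secret";   // invented demo secret
    std::printf("%06u\n", Totp(key, sizeof(key) - 1, std::time(nullptr)));
}
```

The server side performs the same computation with the shared secret and accepts a small window of adjacent time steps to tolerate clock drift.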

If your business is already using two-factor authentication, you are already familiar with the various issues around token usability and logistics. Intel® IPT with OTP is a built-in hardware token (with your security vendor of choice) that negates the need for a separate physical token, thus simplifying the two-factor VPN log-in process for a seamless experience with virtually no delays.

With Intel® IPT with OTP on Intel® processor-based devices, Intel provides a hardware root of trust: proof to websites, financial institutions, and network services that a unique Intel processor-based device, and not malware, is logging into an account. Intel® IPT with OTP-enabled systems offer additional identity protection and transaction verification methods that can be utilized by multifactor authentication solutions.

About the Author

Jianjun Gu is a senior application engineer in the Intel Software and Solutions Group (SSG), Developer Relations Division, Mobile Enterprise Enabling team. He focuses on the security and manageability of enterprise applications.

Zhihao Yu is an application engineer in the Intel Software and Solutions Group (SSG), Developer Relations Division, responsible for enabling Intel® TEE technologies and supporting secure payment solutions based on Intel® platforms.

Liang Zhang is an application engineer in the Intel Software and Solutions Group (SSG), Developer Relations Division, responsible for supporting enterprise app and Internet of Things developers on Intel® platforms.

Link to Part 2

References

1 TEE System Architecture v1.0: http://www.globalplatform.org/specificationsdevice.asp
2 Intel® Software Guard Extensions (Intel® SGX), ISCA 2015 tutorial slides for Intel® SGX: https://software.intel.com/sites/default/files/332680-002.pdf
3 Using Innovative Instructions to Create Trustworthy Software Solutions: https://software.intel.com/en-us/articles/using-innovative-instructions-to-create-trustworthy-software-solutions
4 Intel® 64 and IA-32 Architectures Software Developer Manuals: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
5 Ravi Sahita and Uday Savagaonkar. “Towards a Virtualization-enabled Framework for Information Traceability (VFIT).” In Insider Attack and Cyber Security Volume 39 of the series Advances in Information Security, pp 113-132, Springer, 2008.
6 Intel® Identity Protection Technology (Intel® IPT): http://ipt.intel.com/Home
7 Introduction to Intel® AES-NI and Intel® Secure Key Instructions: https://software.intel.com/en-us/node/256280
8 Intel® RealSense™ technology: http://www.intel.com/content/www/us/en/architecture-and-technology/realsense-overview.html

The Now of Device Usage: Opportunities for Developers in 2016 and Beyond


By Karen Marcus

Recently, Intel UX Innovation Manager Dr. Daria Loi and her team conducted a study to determine how technology users around the world feel about their computing devices. The study, titled Current Usages: A Global Perspective, explored which devices are people’s favorites, which they use most often, which they use for certain tasks (and why), and what they wish the devices could do better. The research included 1,200 people from six countries across an age range of several decades.

The study found that, while devices such as smartphones, laptops, desktops, and tablets allow users to perform many critical work and personal functions—and that most users have a favorite device (see Figure 1)—there are many areas for improvement. This article discusses the study’s findings in detail and suggests numerous opportunities for developers to create software that can perfect users’ experience.

Figure 1. Study responses for favorite device

No One Size Fits All

Findings from the Current Usages: A Global Perspective study showed that for most people, their smartphone is their favorite device (39%), followed by laptops (30%), and then desktops (21%). But none of these devices can provide the interface to help users achieve every task as they move among work, personal chores, and play. For example, phones are great for taking pictures, listening to music, participating in social media, and communicating with others, but because of screen size and (true or perceived) security reasons, they’re not as useful for tasks like shopping, banking, image editing, and emailing.

Dr. Loi observes, “A smartphone is portable, so that’s the device people tend to reach for most often. Because it’s a favorite device, we want to think it’s perfect, but currently, there’s no perfect device, and that’s why people own multiple devices. However, not all current or emerging users—especially those in developing nations and young users—can afford multiple devices, nor do they want to carry them around. So, as technology developers, the question for us is, ‘How can we create a future that enables people to do everything they want without the bother and expense of multiple devices? How can we help them streamline?’”

In looking for such areas for convergence, developers may want to take note of what users already value in their devices. When asked about the most important features of their devices, survey participants most frequently cited the operating system as their number-one feature, followed by performance, screen size, ease of use, and brand, in that order (see Figure 2). Loi notes that ease of use should be of particular interest because it has "huge repercussions for developers as ease of use absolutely needs to be there; otherwise, people won't buy."

Figure 2. Study responses for most important features

The Device Shuffle

According to the study, some functions don’t skew strongly in any one direction in terms of which device people prefer when performing them. These functions include reading locally stored books, magazines, and text; browsing photos stored locally or online; live chat; interactive text platforms; video chatting and conferencing; and playing casual games. For these functions, says Loi, “People are moving across devices, depending on the capabilities of each one.”

They may also use different devices based on where they are. When at home, people tend to do things that require a laptop or PC, such as online purchasing, online banking, studying, and playing advanced games. At work or school, they perform tasks on laptops or PCs—including presenting reports, creating and editing documents, emailing, and reading online news or text—and smartphones, including checking the weather and using a calendar. Another location where people tend to use PCs more is Internet cafes and similar dedicated spaces, where typical tasks include using VoIP, video conferencing, and updating blogs or websites.

When on the go, smartphones are the device of choice for tasks like searching locations, navigating, taking pictures or recording videos, and listening to music. “People have expectations based on where they are,” explains Loi. “Their tolerance in terms of responsiveness is different depending on their location. What they expect to see changes when the context changes. Here, the goal for developers is to create applications that work well across contexts or are capable of changing so that the experience for users becomes seamless, no matter what the situation.”

Smartphones Getting Smarter

Whereas some tasks are device-agnostic, others are more commonly performed on a specific device. Forty-three percent of study participants identified smartphones as their device of choice for everyday tasks like checking weather, storing contacts, and using a calendar; 47% said that smartphones were best for searching locations and navigation; and 62% said that smartphones are best for recording photos and video (see Figure 3).

Figure 3. Study responses for favorite device for locate and check functions

Yet, people hesitate to conduct online shopping on their smartphone because they have a sense that it’s less secure. Some participants specifically mentioned that they can’t always see the entire screen and are concerned that they might not see a button or other element they need to be aware of to ensure they’re completing the purchase according to their wishes. However, Loi says, “People would like to perform these tasks on any device, so that’s an opportunity for developers to create the software and infrastructure to enable them to do it safely and securely. This will become increasingly important as Google and Apple push their solutions for using smartphones as credit cards.”

Other potential areas for improvement with smartphones include functions to reduce people’s reliance on other devices when on the go or when those devices are not available. For example, how could a smartphone provide richer productivity capabilities? Considering that people increasingly use their phones as a primary camera, how can camera capabilities be enriched? And, since people want to listen to music on their phone, can smartphones be equipped with better functionality and user interfaces?

Developers have yet another opportunity with smartphones: making them more useful for education. Loi notes, “Teens and young adults cannot imagine living without their phones, yet these devices are rarely integrated into school curricula. In fact, we ask them to leave their devices home or to turn them off; yet, to them, a smartphone is an appealing, familiar, exciting everyday tool that offers an opportunity to make learning stick. In this context, our challenge is to work with educators to develop solutions that enable fluid, seamless, delightful, relevant, deep learning.”

PC Performance Wins

Despite the love affair people have with their phones, when it comes to certain tasks, their laptops, All-in-Ones, and notebooks are their preferred devices (see Figure 4). Specifically, 38% of users choose laptops for editing or modifying media, 47% choose them for creating or editing documents, 36% want them for updating blogs or websites, 41% use them for online banking and bill payment, 44% use them to browse products for potential purchase, and 41% use them to purchase products online. Loi notes that screen size is a primary reason for these preferences: "It's easier to move around and do fine-grained work on a larger screen." In addition, she says, "Many software packages aren't available for smartphones or are prohibitively expensive. The ecosystem of software plus the practical, physical performance advantage of laptops make them the device of choice for these tasks."

Figure 4. Study responses for favorite device for online purchasing

Survey participants also favored laptops for communication and entertainment functions, such as watching online videos (36%); watching locally stored videos (37%); uploading, downloading, sharing, or transferring media (38%); emailing (42%); and voice over IP (VoIP) and other online voice platforms (35%; see Figure 5). Loi explains, “These are applications that require the high performance that smartphones don’t yet have. In the future, the capabilities of PCs may migrate to smaller devices. Again, the right ecosystem, software, middleware, interface, and infrastructure—in addition to the right ports to easily transfer media—would be needed to make that happen.”

Figure 5. Study responses for favorite device for writing and talking

Other areas where survey participants preferred the performance and screen size of their laptops were presenting reports (59%), playing advanced games (24% for laptops and 25% for desktops), browsing or researching information online (38%), studying (47%), and reading online news or text (39%; see Figure 6). Loi observes, "People can perform some of these functions from a smartphone, but phones don't talk to other devices (such as projectors) as well as computers do; their performance doesn't enable usages such as advanced gaming; and their limited screen size is an issue when engaging in focused, prolonged, and multi-tasking-rich usages such as studying or researching."

Figure 6. Study responses for favorite device for reading, learning, and research

Speaking of Technology

To better understand participants’ feelings about their devices, Loi and her team interviewed a number of them to gather qualitative information to enrich the quantitative data gathered through the global survey. One important finding was that people increasingly have a love–hate relationship with technology. Loi says, “People realize that they can’t live without their devices, but they also feel that they’re enslaved by them. A typical sentiment was, ‘I want to be able to continuously leverage devices to do the things that are important to me, but I also want them to be smart enough to get out of the way when I don’t need them.’ For example, if someone is in a meeting, his or her smartphone should have this contextual awareness and prevent incoming calls. There’s a lot of potential for creating machines that can learn users’ behavior and take initiatives to disengage when not needed.”

“I miss the times when we weren’t always ON.” —Aron, 20s

On the “love” side of the love–hate spectrum, study participants had things they appreciated about their devices, including that they help to streamline life, connect with others, and interface in real time. But, Loi observes, even the benefits that people like could be better. She queries, “How can we enrich the way we connect with others, beyond existing tools? What technological solutions can be developed to make us feel more present and connected with others, regardless of physical distance?”

“I want to connect to family more; we seem to be looking at [our] devices instead of talking to each other.”
—Carol, 40s

Along with an appreciation of technology comes frustration. Some of the top areas people wish they could change were that device batteries die too fast, that devices slow down or freeze, that they’re susceptible to viruses and privacy intrusions, that they’re too big for true portability, that they’re too small to store everything, and that they’re difficult to operate. A key irritation for study participants was that they want to be more mobile. Loi says, “We’re an increasingly mobile society—a society that fully relies on devices to get things done. But, to be useful tools, those devices need battery life. When I talk to users, it’s clear that charging has become an obsession for most, an impediment to being truly mobile and effective. Among other opportunities, wireless charging can provide great benefits in this context; yet, there needs to be an efficient, reliable, intuitive ecosystem to deliver that usage.” She notes that passwords are another “problem we need to solve,” along with technology changing too quickly, which would be helped through efforts to enable people to transition from older to newer operating systems and applications.

“Stuff changes quickly all the time; it’s hard to keep up.” —Raul, 20s

In response to these frustrations, study participants had thoughts about how the technology could serve them better. They expressed wanting it to be more personalized, power efficient, affordable, ubiquitous, more voice-based, and operating system agnostic, among other things. People also want technology that’s less in the way and more capable of contextually anticipating some user needs. Loi remarks, “People want power, control, and options. Another key factor is ease of use. For developers, usability has to be a top priority. If a user can’t find a function, then it’s essentially not there, and people won’t buy the application.”

“I like to make choices. I don’t want to be told what to do.” —Francis, 30s

Despite the frustration of constant change, people are excited to see what comes next. Loi says that many of the study participants described futuristic features they'd like to see on their devices—what some participants called Iron Man experiences. These features include the ability to think messages to communicate rather than having to speak or type them; live holographic images to chat with; and a super-sensitive microphone that could pick up voice commands from anywhere in the home. Loi says, "Social media and Hollywood give people ideas and expectations for what might be next. Some people don't understand why these futuristic technologies aren't already available. Despite their complexity, many expect them to be on the market soon, in many cases sooner than they actually will be."

“The ability to show someone else exactly what you are imagining. . . like a 3-dimensional hologram.” —Sheila, 20s

Summary

Significant strides have been made in computing technology in recent years, and there are many things people love about their devices. Yet there are many areas in which technology could be even more convenient and user friendly. Dr. Loi of Intel headed the study, Current Usages: A Global Perspective, which set out to understand how people around the world use their devices now and how they would like to use them in the future. The study revealed the following areas of opportunity for developers:

  • Streamline functionality so that people can carry (and spend money on) fewer devices.
  • Continue to improve usability.
  • Improve security and visibility on smartphones for online shopping.
  • Improve smartphone functions (such as cameras and music access) to take the place of separate devices.
  • Make smartphones more useful for education.
  • Make PC software packages more available and useful for smartphones, and give smartphones the capacity to handle more robust applications.
  • Create a more seamless, contextual experience for users.
  • Improve communications technology to make people feel closer to those with whom they’re communicating.
  • Make battery charges last longer, and find new ways to charge devices.
  • Help people migrate from older applications and systems to newer ones.

The key, says Loi, is delivering a better vision for the future. For this to happen, she says, “It’s critical for all new developments to be grounded in an understanding of what people do, want, need, and desire. This research is meant to influence what we as an industry do based on a clear, data-driven understanding of what users are doing and what they care about.”

About the Author

Karen Marcus, M.A., is an award-winning technology marketing writer who has 18 years of experience. She has developed case studies, brochures, white papers, data sheets, articles, website copy, video scripts, and other content for such companies as Intel, IBM, Samsung, HP, Amazon Web Services, and Microsoft. Karen is familiar with a variety of current technologies and solutions, such as cloud computing, enterprise computing, personal computing, IT outsourcing, operating systems, and application development.

Intel® Studio products and Intel® XDK Installation failure 'Package signature verification failed.'


Installation of Intel® Parallel Studio XE, Intel® System Studio, Intel® INDE, or  Intel® XDK may fail with a message from the installer 'Package signature verification failed.'  There are two specific causes of this failure.  The most likely cause is that valid trusted root certificates cannot be found on the system.  These are needed by the installer to verify the package is good.  A secondary cause is that the installation package is corrupted, or that the package has an invalid signature or timestamp.

If valid trusted root certificates cannot be found on the system, the reasons may include:

  • The system either does not have access to the Internet, or if it does, Windows Update cannot acquire the needed certificates.  In either case, the needed certificates will need to be downloaded and installed separately.  Two certificates are needed; one to verify the digital signature, and another to verify the timestamp. 
    • The 'AddTrust External CA Root' certificate is needed to verify the digital signature and may be obtained here.
    • The 'QuoVadis Root Certification Authority' certificate is needed to verify the timestamp and may be obtained here.
  • The operating system is not a supported version.  For example, Microsoft* Windows 7* is the oldest version supported for installation of  Intel® Parallel Studio XE 2016; older versions of the OS may not have valid certificates, or the OS may not support valid certificates.  For a complete listing of supported operating systems, see the Intel® C++ Compiler 16.0 Update 1 for Windows* Release Notes (all components of Intel® Parallel Studio XE 2016 have the same OS requirements).
  • For the operating system requirements of  Intel® System Studio, Intel® INDE, or  Intel® XDK, see the Release Notes for those products.

If the installation package is corrupted, or the package has an invalid signature or timestamp, then it may have been corrupted during a download, or perhaps been tampered with or corrupted by a virus.  The solution is to obtain a fresh copy of the package and attempt the install again.

 

Intel® System Studio for Microcontrollers Getting Started for Linux*


< Getting Started with Intel® System Studio for Microcontrollers >

 This document provides a general overview of Intel® System Studio for Microcontrollers, explains how to use it for developing and debugging your applications for the Intel® Quark™ microcontroller D1000 on Linux* platforms from the command line and from the Eclipse* IDE, lists the compiler options, and points to additional product information and technical support.

 The Intel® Quark™ microcontroller D1000 requires only a mini-USB connection for flashing, GDB debugging through OpenOCD, and UART communication.

  

< Introducing the Intel® System Studio for Microcontrollers >

 Intel® System Studio for Microcontrollers is an integrated tool suite for developing and debugging systems and applications for the Intel® Quark™ microcontroller D1000 target, a configurable and fully synthesizable accelerator and microcontroller core (hereinafter often referred to as the “MCU”). Further in this document, we will refer to Intel® System Studio for Microcontrollers as the “suite”, the “toolchain”, or the “toolset”.
The toolset consists of the following components:

  • C/C++ LLVM-based compiler with MCU support, including linker, assembler, and C/C++ run-time libraries
  • GDB Debugger with MCU support
  • OpenOCD with MCU support

You can use the toolset from the command line and from the Eclipse* IDE (Luna and Mars releases).
The toolset supports the following host operating systems:

  • Linux* (Fedora* 19 and Ubuntu* 12.04 LTS and 14.04 LTS)

 

Installing the Intel® System Studio for Microcontrollers

Download the Intel® System Studio for Microcontrollers from the Intel Registration Center page.
Before installing the toolchain, make sure you have at least 140 MB of free disk space.
The name of the archive is:
- l_cembd_iqd_p_1.0.n.xxx.tgz (for Linux*)
where “n” is the “update release” number and “xxx” represents the package build number.

Install the toolchain by extracting the contents of the archive corresponding to your operating system to a directory where you have write access. Note that there is no default installation directory for the toolchain. Make sure the installation directory does not contain spaces.

Extract the contents of the archive to a directory where you have write access, for example, your $HOME directory. Use the following command:

tar -xzf l_cembd_iqd_p_1.0.0.001.tgz -C $HOME

In this example, your installation directory will be $HOME/l_cembd_iqd_p_1.0.n.xxx.

 

Installing a valid version of glibc

Make sure you have a valid version of the GNU C Library (glibc). Visit http://www.gnu.org/software/libc/ for installation.

For Fedora* it is glibc.i686. Execute the following command from a terminal as root:

yum install glibc.i686

For Ubuntu* it is ia32-libs. Execute the following command from a terminal as root:

apt-get install ia32-libs

 

Installing USB Driver

 By default, non-root users do not have access to the JTAG pods connected via USB. You must grant write access to the proper /dev/bus/usb entry every time a device is connected to be able to run OpenOCD using a non-root account.

The process can be automated by adding a udev rule:
1. Create a text file in the rules directory:

sudo vim /etc/udev/rules.d/99-openocd.rules

2. Type the following content:


SUBSYSTEM=="usb", ATTR{idVendor}=="0403", ATTR{idProduct}=="6010",MODE="0666"


3. Unplug the device and plug it in again (or reboot the system).
Take these steps; otherwise, OpenOCD fails to run with an error message:


Error: libusb_open() failed with LIBUSB_ERROR_ACCESS
Error: no device found
Error: unable to open ftdi device with vid 0403, pid 6010, description '*'
and serial '*'

 

4. Check what appears after plugging in the D1000 board once installation succeeds. Type 'sudo dmesg -c', then plug the board into your machine, and then type 'sudo dmesg -c' once again.

 

 

Compiling and Debugging the Project

Please refer to the attached PDF user guide for the details.

 

Firmware Example

You can modify the firmware that comes with the Intel® System Studio for Microcontrollers package. The screenshot below shows a modified version of the PushButton test from the firmware.

It detects a button push and prints out a string through the UART.

 

       

Incompatibility between fenv.h implementations in Intel® C++ and Microsoft* Visual C++* Compilers


Reference Number : DPD200570470, DPD200570483

Version : Intel® C++ Compiler Version 16.0 and earlier; Microsoft* Visual C++* Version 2013 and later

Operating System : Windows*

Problem Description : An unexpected segmentation fault or incorrect results may be seen at run time when applications that access the floating-point environment, for example by making use of the C99 floating-point environment header file fenv.h, are built partly with the Intel C++ compiler and partly with the Microsoft Visual C++ compiler version 2013 or later.

Cause : There are several differences between the fenv.h header file introduced in the 2013 version of Microsoft Visual C++ and the version provided in the Intel C++ compiler version 16 and earlier. These include a different size for the fenv_t data type, other differences in the definitions of fenv_t and fexcept_t, and differences in the definitions of macros, especially FE_DFL_ENV.
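The affected code is anything that saves, modifies, or restores the floating-point environment through this interface. The small self-contained example below uses only standard C99/C++ <cfenv> calls to show the kind of code involved: if the translation unit that produces the fenv_t and the one that later consumes it are compiled against headers whose fenv_t layouts or FE_DFL_ENV definitions disagree, the failures described above can result.

```cpp
// Example of code that depends on the fenv.h types and macros mentioned above.
// If this translation unit and another one that shares the saved fenv_t are
// built against headers whose fenv_t layouts disagree, corruption can result.
#include <cfenv>
#include <cstdio>

#pragma STDC FENV_ACCESS ON   // tell the compiler this code touches the FP environment

int main() {
    volatile double num = 1.0, den = 3.0;   // volatile so the division happens at run time

    std::fenv_t saved;
    std::fegetenv(&saved);                  // fenv_t size/layout differs between the headers

    std::fesetround(FE_UPWARD);             // change the rounding mode
    std::printf("1.0/3.0 rounded up:   %.20f\n", num / den);

    std::fesetenv(FE_DFL_ENV);              // FE_DFL_ENV is one of the differing macros
    std::printf("1.0/3.0 default mode: %.20f\n", num / den);

    std::fesetenv(&saved);                  // restore the environment saved earlier
    return 0;
}
```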

Resolution Status : This is a known issue that may be substantially resolved in a future compiler version. However, changes to enhance compatibility of implementations of fenv.h in future Intel compilers with implementations of fenv.h in the Microsoft compiler may lead to incompatibilities with implementations of fenv.h in older Intel compilers.  

Workaround : Build the entire application with the Intel compiler.
