Inspect CPU activity with CPU Profiler
Optimizing your app’s CPU usage has many advantages, such as providing a faster and smoother user experience and preserving device battery life.
You can use the CPU Profiler to inspect your app’s CPU usage and thread activity in real time while interacting with your app, or you can inspect the details in recorded method traces, function traces, and system traces.
The specific kinds of information that the CPU Profiler records and shows are determined by which recording configuration you choose:
- System Trace: Captures fine-grained details that allow you to inspect how your app interacts with system resources.
Method and function traces: For each thread in your app process, you can find out which methods (Java) or functions (C/C++) are executed over a period of time and the CPU resources each method or function consumes during its execution. You can also use method and function traces to identify callers and callees. A caller is a method or function that invokes another method or function, and a callee is one that is invoked by another method or function. You can use this information to determine which methods or functions are responsible for invoking particular resource-heavy tasks too often and optimize your app’s code to avoid unnecessary work.
When recording method traces, you can choose sampled or instrumented recording. When recording function traces, you can only use sampled recording.
For details of using and choosing each of these trace options, see Choose a recording configuration.
CPU Profiler overview
To open the CPU Profiler, follow these steps:
Select View > Tool Windows > Profiler or click Profile in the toolbar.
If prompted by the Select Deployment Target dialog, choose the device to which to deploy your app for profiling. If you’ve connected a device over USB but don’t see it listed, ensure that you have enabled USB debugging.
Click anywhere in the CPU timeline to open the CPU Profiler.
When you open the CPU Profiler, it immediately starts displaying your app’s CPU usage and thread activity. You should see something similar to figure 1.
Figure 1. Timelines in the CPU Profiler.
As indicated in figure 1, the default view for the CPU Profiler includes the following timelines:
- Event timeline: Shows the activities in your app as they transition through different states in their lifecycle, and indicates user interactions with the device, including screen rotation events. For information on enabling the event timeline on devices running Android 7.1 (API level 25) and lower, see Enable advanced profiling.
- CPU timeline: Shows real-time CPU usage of your app—as a percentage of total available CPU time—and the total number of threads your app is using. The timeline also shows the CPU usage of other processes (such as system processes or other apps), so you can compare it to your app’s usage. You can inspect historical CPU usage data by moving your mouse along the horizontal axis of the timeline.
- Thread activity timeline: Lists each thread that belongs to your app process and indicates their activity along a timeline using the colors listed below. After you record a trace, you can select a thread from this timeline to inspect its data in the trace pane.
- Green: The thread is active or is ready to use the CPU. That is, it’s in a running or runnable state.
- Yellow: The thread is active, but it’s waiting on an I/O operation, such as disk or network I/O, before it can complete its work.
- Gray: The thread is sleeping and is not consuming any CPU time. This sometimes occurs when the thread requires access to a resource that is not yet available. Either the thread goes into voluntary sleep, or the kernel puts the thread to sleep until the required resource becomes available.
The CPU Profiler also reports CPU usage of threads that Android Studio and the Android platform add to your app process—such as JDWP , Profile Saver , Studio:VMStats , Studio:Perfa , and Studio:Heartbeat (although, the exact names displayed in the thread activity timeline may vary). Android Studio reports this data so that you can identify when thread activity and CPU usage are actually caused by your app’s code.
Content and code samples on this page are subject to the licenses described in the Content License. Java is a registered trademark of Oracle and/or its affiliates.
Источник
Fact or Fiction: Android apps only use one CPU core
We have had multi-core processors in our PCs for over a decade, and today they are considered the norm. At first it was dual-core, then quad-core, and today companies like Intel and AMD offer high end desktop processors with 6 or even 8 cores. Smartphone processors have a similar history. Dual-core energy efficient processors from ARM arrived about 5 years ago, and since then we have seen the release of ARM based 4, 6 and 8 core processors. However there is one big difference between the 6 and 8 core desktop processors from Intel and AMD and the 6, and 8 core processors based on the ARM architecture – most ARM based processors with more than 4 cores use at least two different core designs.
While there are some exceptions, in general an 8 core ARM based processor uses a system known as Heterogeneous Multi-Processing (HMP) which means that not all the cores are equal (hence Heterogeneous). In a modern 64-bit processor this would mean that a cluster of Cortex-A57 or Cortex-A72 cores would be used in conjunction with a cluster of Cortex-A53 cores. The A72 is a high performance core, while the A53 has greater energy efficiency. This arrangement is known as big.LITTLE where big processor cores (Cortex-A72) are combined with LITTLE processor cores (Cortex-A53). This is very different to the 6 or 8 core desktop processors that we see from Intel and AMD, as on the desktop power consumption isn’t as critical as it is on mobile.
When multi-core processors first came to the desktop, a lot of questions were raised about the benefits of a dual-core processor over a single core processor. Was a dual-core 1.6GHz processor “better” than a 3.2GHz single core processor, and so on. What about Windows? Could it utilize a dual-core processor to its maximum potential. What about games – aren’t they better on single-core processors? Don’t applications need to be written in a special way to use the extra cores? And so on.
Multi-processing primer
These are legitimate questions, and of course the same questions have been asked about multi-core processors in smartphones. Before we look at the question of multi-core processors and Android apps, let’s take a step back and look at multi-core technology in general.
Computers are very good a doing one thing. You want to calculate the first 100 million prime numbers? No problem, a computer can loop round and round all day crunching those numbers. But the moment you want a computer to do two things at once, like calculating those primes while running a GUI so you can also browse the web, then suddenly everything becomes a bit more difficult.
I don’t want to go too deep here, but basically there is a technique known as preemptive multi-tasking which allows the available CPU time to be split among multiple tasks. A “slice” of CPU time will be given to one task (a process) and then a slice to the next process, and so on. At the heart of operating systems like Linux, Windows, OS X, and Android is a bit of technology called a scheduler. Its job is to work out which process should receive the next slice of CPU time.
Schedulers can be written in different ways, on a server the scheduler might be tuned to give priority to tasks performing I/O (like writing to the disk, or reading from the network), whereas on a desktop the scheduler will be more concerned with keeping the GUI responsive.
When there is more than one core available the scheduler can give one process a slice of time on CPU0, while another process gets a slice of run-time on CPU1. This way a dual-core processor, together with the scheduler, can allow two things to happen at once. If you then add more cores, then more processes can run simultaneously.
You will have noticed that the scheduler is good at slicing up the CPU resources between different tasks like calculating primes, running the desktop, and using a web browser. However a single process like calculating primes can’t be split across multiple cores. Or can it?
Some tasks are sequential by nature. To make a cake you need to crack some eggs, add some flour, make the cake mix etc, and then at the end put it into the oven. You can’t put the cake tin into the oven until the cake mix is ready. So even if you had two chefs in a kitchen you can’t necessarily save time on one task. There are steps to be followed and the order can’t be broken. You can multi-task, in that while one chef is making the cake the other can prepare a salad, but tasks which have a predefined sequence can’t benefit from dual-core processors or even 12 core processors.
However not all tasks are like that. Many operations that a computer performs can be split into independent tasks. To do this the main process can create another process and farm out some of the work to it. For example, if you are using an algorithm to find prime numbers, that doesn’t rely on previous results (i.e. not a Sieve of Eratosthenes), then you could split the work in two. One process could check the first 50 million numbers and the second process could check the second 50 million. If you have a quad-core processor then you could split the work into four, and so on.
But for that to work the program needs to be written in a special way. In other words the program needs to be designed to split the workload into smaller chunks rather than doing it in one lump. There are various programming techniques for doing this, and you might have heard expressions like “single-threaded” and “multi-threaded.” These terms broadly mean programs which are written with just one executing program (single-threaded, all lumped together) or with individual tasks (threads) which can be independently scheduled to get time on the CPU. In short, a single-threaded program won’t benefit from running on a multi-core processor, whereas a multi-threaded program will.
OK, we are almost there, just one more thing before we look at Android. Depending on how an operating system has been written, some actions that a program performs can be multi-threaded by nature. Often the different bits of an OS are themselves independent tasks and when your program performs some I/O or maybe draws something to the screen that action is actually carried out by another process on the system. By using of what is known as “non-blocking calls” it is possible to get a level of multi-threading into a program without actually specifically creating threads.
This is an important aspect for Android. One of the system level tasks in Android’s architecture is the SurfaceFlinger. It is a core part of the way Android sends graphics to the display. It is a separate task that needs to be scheduled and given a slice of CPU time. What this means is that certain graphic operations need another process to run before they are complete.
Android
Because of processes like the SurfaceFlinger, Android benefits from multi-core processors without a specific app actually being multi-threaded by design. Also because there are lots of things always happening in the background, like sync and widgets, then Android as a whole benefits from using a multi-core processor. As you would expect Android has the ability to create multi-threaded apps. For more information on this see the Processes and Threads section in the Android documentation. There is also some multi-threaded examples from Google, and Qualcomm have an interesting article on programming Android apps for multi-core processors.
However the question still remains, are the majority of Android apps single-threaded, and as such only use one CPU core? This is an important question because if the majority Android apps are single-threaded then you could have a smartphone with monster multi-core processor, but in reality it will perform the same as a dual-core processor!
There seems to be some confusion about the difference between quad-core and octa-core processors. In the desktop and server world octa-core processors are built using the same core design replicated across the chip. However for the majority of ARM based octa-core processors there are high performance cores and core with better energy efficiency. The idea is that the more energy efficient cores are used for more menial tasks while the high performance cores are used for the heavy lifting. However it is also true that all the cores can be used simultaneously, like on a desktop processor.
The key thing to remember is that an octa-core big.LITTLE processor has eight cores for power efficiency, not for performance.
Testing
Using this tool, which doesn’t have a name yet, I ran a series of different types of apps (gaming, web browsing etc) on a phone with a quad-core Qualcomm Snapdragon 801 processor and again on a phone with an octa-core Qualcomm Snapdragon 615 processor. I have collated the data from these test runs and with the help of Android Authority’s Robert Triggs, I have generated some graphs which show how the processor is being used.
Let’s start with an easy use case. Here is a graph of how the cores in the Snapdragon 801 are used when browsing the web using Chrome:
The graph shows how many cores are being used by Android and the web browser. It doesn’t show how much the core is being used (that comes in a moment) but it shows if the core is being utilized at all. If Chrome was single-threaded then you would expect to see one or two cores in use and maybe a blip up to 3 or 4 cores occasionally. However we don’t see that. What we see is the opposite, four cores are being used and it occasionally dips down to two. In the browsing test I didn’t spend time reading the pages that loaded, as that would have resulted in no CPU use. However I waited until the page was loaded and rendered, and then I moved on to the next page.
Here is a graph showing how much each core was utilized. This is an averaged-out graph (as the real one is a scary scrawl of lines). This means that the peak usages are shown as less. For example the peak on this graph is just over 90%, however the raw data shows that some of the cores hit 100% multiple times during the test run. However it still gives us a good representation of what was happening.
So what about an octa-core? Will it show the same pattern? As you can see from the graph below, no it doesn’t. Seven cores are consistently being used with the occasional spike to 8, and a few times when it dips to 6 and 4 cores.
Also the averaged core usage graph shows that the scheduler behaved quite differently since the Snapdragon 615 is a big.LITTLE processor.
You can see that there are two or three cores which run more than the others, however all the cores are being utilized in some way or another. What we are seeing is how the big.LITTLE architecture is able to swap threads from one core to another depending on the load. Remember the extra cores are here for energy efficiency, not performance.
However I think we can safely say that it is a myth that Android apps only use one core. Of course this is to be expected since Chrome is designed to be multi-threaded, on Android as well as on PCs.
Other apps
So that was Chrome, an app that is designed to be multi-threaded, what about other apps? I ran some tests on other apps and briefly this is what I discovered:
- Gmail – On a quad-core phone the core usage was evenly split between 2 and 4 cores. However average core utilization never went above 50% which is to be expected as this is a relatively light app. On an octa-core processor the core usage bounced between 4 and 8 cores, but with a much lower average core utilization of less than 35%.
- YouTube – On a quad-core phone only 2 cores were used, and on average at less than 50% utilization. On an octa-core phone YouTube mainly used 4 cores with the occasional spike to 6, and drop to 3. However the average core utilization was just 30%. Interestingly the scheduler heavily favored the big cores and the LITTLE cores were hardly used.
- Riptide GP2 – On a phone with a quad-core Qualcomm processor this game used two cores most of the time with the other two cores doing very little. However on an phone with an octa-core processor, between six and seven cores where used consistently, however most of the work was done by just three of those cores.
- Templerun 2 – This game probably exhibits the single-threaded problem more than the other apps I tested. On an octa-core phone the game used between 4 and 5 cores consistently and peaked at 7 cores. However really only one core was doing all the hard work. On a quad-core Qualcomm Snapdragon 801 phone, two cores shared the work fairly evenly, and two cores did very little. On a quad-core MediaTek phone all four cores shared the workload. This highlights how a different scheduler and different core designs can drastically alter the way the CPU is used.
Here is a selection of graphs for you to peruse. I have included a graph showing the octa-core phone idle, as a base reference:
Idle: Octa-core phone with screen active but no user interactions.
YouTube running on a quad-core phone.
YouTube running on an octa-core phone.
TempleRun 2 on a quad-core phone.
TempleRun 2 on an octa-core phone.
TempleRun 2 on a quad-core MediaTek phone.
Gmail running on a quad-core phone.
Gmail running on an octa-core phone.
Riptide GP2 running on quad-core phone.
Riptide GP2 running on octa-core phone.
One interesting app was AnTuTu. I ran the app on the octa-core phone and this is what I saw:
As you can see, the latter part of the test completely maxes out all the CPU cores. It is clear that the benchmark is artificially creating a high workload, and since nearly all the cores are running at full speed then SoCs with more cores will score better for that part of the test. I never saw this kind of workload on any normal apps.
In one way it is the benchmarks which are artificially inflating the performance benefits of octa-core phones (rather than the power efficiency advantages). For a more comprehensive look at benchmarking check out Beware of the benchmarks, how to know what to look for.
Why are light apps using 8 cores?
If you look at an app like Gmail you will notice and interesting phenomenon. On a quad-core phone the core usage was evenly split between 2 and 4 cores, but on an octa-core phone the app used between 4 and 8 cores. How come Gmail can run on 2 to 4 cores on a quad-core phone but needs at least four cores on an octa-core phone? That doesn’t make sense!
The key again is to remember that on big.LITTLE phones not all the cores are equal. What we are actually seeing is how the scheduler is using the LITTLE cores then as the workload increases the big core are brought into play. For a while there is a small amount of crossover and then the LITTLE cores go to sleep. Then when the workload decreases the opposite happens. Of course this is all happening very fast, thousands of times per second. Look at this graph which shows the utilization of big vs LITTLE cores during my testing of Epic Citadel:
Notice how at first the big cores are being used and the LITTLE cores are inactive. Then, at around the 12 second mark, the big cores start to be used less and the LITTLE cores spring to life. At the 20 second mark the big cores increase their activity again and the LITTLE cores go back down to almost zero usage. You can see this again at the 30 second mark, the 45 second mark, and at the 52 second mark.
At these points the number of cores being used fluctuates. For example, in the first 10 seconds only 3 or 4 cores are being used (big cores), and then at the 12 second mark the core usage peaks at 6 and then drops again to 4, and so on.
This is big.LITTLE in action. A big.LITTLE processor isn’t designed like the octa-core processors for PCs. The extra cores allows the scheduler to pick the right core for the right job. In all my tests I did not see any real-world apps that used all 8 cores at 100%, and that is how it should be.
Caveats and wrap-up
The first thing to underline is that these tests do not benchmark the performance of the phones. My testing only shows if Android apps run across multiple cores. The advantages or disadvantages of running over multiple core, or running on a big.LITTLE SoC, are not covered. Neither are the benefits or drawbacks of running parts of an app on two cores at 25% utilization, rather than on one core at 50%, and so on.
Secondly, I haven’t yet had a chance to run these tests on a Cortex-A53/Cortex-A57 setup or a Cortex-A53/Cortex-A72 setup. The Qualcomm Snapdragon 615 has a quad-core 1.7GHz ARM Cortex A53 cluster and a quad-core 1.0GHz A53 cluster.
Источник