
Sudhir Pandey's List: software profiling

    • To eliminate data cache misses, use arrays for data structures rather than linked lists, if possible, and reduce the size of elements in data structures, and access memory in a cache-friendly way. In any event, valgrind helps to point out which accesses/data structures should be optimized. This application run summary shows that data accesses are the main proble
    • , for example, may make it possible to reduce the number of cache misses and increase performance. cachegrind and oprofile are great tools for finding out how an application uses the cache and which functions and data structures cause cache misses.
    • Column: Explanation

      User time (seconds): This is the number of seconds of CPU spent by the application.

      System time (seconds): This is the number of seconds spent in the Linux kernel on behalf of the application.

      Elapsed (wall-clock) time (h:mm:ss or m:ss): This is the amount of time elapsed (in wall-clock time) between when the application was launched and when it completed.

      Percent of CPU this job got: This is the percentage of the CPU that the process consumed as it was running.

      Major (requiring I/O) page faults: The number of major page faults or those that required a page of memory to be read from disk.

      Minor (reclaiming a frame) page faults: The number of minor page faults or those that could be filled without going to disk.

      Swaps: This is the number of times the process was swapped to disk.

      Voluntary context switches: The number of times the process yielded the CPU (for example, by going to sleep).

      Involuntary context switches: The number of times the CPU was taken from the process.

      Page size (bytes): The page size of the system.

      Exit status: The exit status of the application.

    • You can see in Listing 4.1 that the elapsed time (~3 seconds) is much greater than the sum of the user (0.9 seconds) and system (0.13 seconds) time, because the application spends most of its time waiting for input and little time using the processor.
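The relationship between user, system, and elapsed time can be reproduced with a small Python sketch. This is not the tool that produced Listing 4.1; it is an illustration that assumes a Unix-like system, and the function name `run_and_measure` and the dictionary keys are hypothetical. It reads the same kind of counters through the standard `resource` module.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(argv):
    """Run argv as a child process and return timing and resource figures."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    status = subprocess.run(argv).returncode
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # RUSAGE_CHILDREN is cumulative, so subtract the values seen before the run.
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return {
        "user_s": user,
        "system_s": system,
        "elapsed_s": elapsed,
        "cpu_percent": 100.0 * (user + system) / elapsed if elapsed > 0 else 0.0,
        "major_faults": after.ru_majflt - before.ru_majflt,
        "minor_faults": after.ru_minflt - before.ru_minflt,
        "voluntary_cs": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_cs": after.ru_nivcsw - before.ru_nivcsw,
        "page_size": resource.getpagesize(),
        "exit_status": status,
    }


if __name__ == "__main__":
    # A child that mostly sleeps: elapsed time should far exceed user + system.
    stats = run_and_measure([sys.executable, "-c", "import time; time.sleep(1)"])
    for name, value in stats.items():
        print(f"{name}: {value}")
```

A child that waits (here it sleeps; in the listing it waits for input) accumulates wall-clock time without accumulating CPU time, which is why elapsed time exceeds the sum of the other two.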


    • 7. Finding Memory Leaks (ps --sort pmem)

       

      A memory leak, technically, is an ever-increasing usage of memory by an application.

       

      With common desktop applications, this may go unnoticed, because a process typically frees any memory it has used when you close the application.

       

      However, in the client/server model, memory leakage is a serious issue, because applications are expected to be available 24×7. Applications must not continue to increase their memory usage indefinitely, because this can cause serious issues. To monitor such memory leaks, we can use the following commands.
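As a rough illustration of "ever-increasing usage", the sketch below samples a process's resident memory and flags a pattern that never shrinks. It is a heuristic and not the commands the source goes on to list: the function names and the `min_growth_kib` threshold are hypothetical, and `read_rss_kib` assumes a Linux `/proc` filesystem.

```python
def read_rss_kib(pid):
    """Return the resident set size of pid in KiB, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # The line looks like "VmRSS:     1234 kB".
                return int(line.split()[1])
    return None


def looks_like_leak(samples, min_growth_kib=1024):
    """Heuristic: memory never drops between samples and grows overall."""
    if len(samples) < 3:
        return False
    never_drops = all(later >= earlier
                      for earlier, later in zip(samples, samples[1:]))
    return never_drops and samples[-1] - samples[0] >= min_growth_kib


if __name__ == "__main__":
    print(looks_like_leak([1000, 2000, 3500, 5000]))  # steadily growing
    print(looks_like_leak([1000, 3000, 1200, 1100]))  # grows, then releases
```

Steady growth can also be legitimate (a cache filling up, for example), so a positive result is only a prompt to look closer at a long-running server process.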

      • When the CPU switches from one process (or thread) to another, it is called a context switch.
      • When a context switch happens, the kernel stores the current state of the CPU (for that process or thread) in memory.
      • The kernel also retrieves the previously stored state (of another process or thread) from memory and loads it into the CPU.
      • Context switching is essential for multitasking on the CPU.
      • However, a high rate of context switching can cause performance issues.
      • Unused RAM is used by the kernel as file system cache.
      • The Linux system swaps when it needs more memory than is physically available. When it swaps, it writes the least used memory pages from physical memory to the swap space on disk.
      • A lot of swapping can cause performance issues, because the disk is much slower than physical memory and it takes time to move memory pages from RAM to disk.
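To see how much swap is in use and how much RAM is serving as file system cache, one option on Linux is to read `/proc/meminfo`. The sketch below parses text in that format; the helper names are hypothetical and the `SAMPLE` figures are invented for illustration, not measurements.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            # Most values are in kB; the unit, if present, follows the number.
            fields[name.strip()] = int(rest.split()[0])
    return fields


def swap_used_kib(fields):
    """Swap currently in use, derived from the SwapTotal and SwapFree fields."""
    return fields["SwapTotal"] - fields["SwapFree"]


# Illustrative input only; on a real system read open("/proc/meminfo").read().
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:          500000 kB
Cached:          3000000 kB
SwapTotal:       2000000 kB
SwapFree:        1500000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("file system cache (kB):", info["Cached"])
    print("swap in use (kB):", swap_used_kib(info))
```

A swap-in-use figure that keeps rising while free memory stays low is the situation the bullets above describe.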
    • 4. Execute Strace on a Running Linux Process Using Option -p

       

      You can run strace on a program that is already running by using its process ID. First, identify the PID of the program using the ps command.

       

      For example, to run strace on the firefox program that is currently running, first identify the PID of the firefox program.
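The two steps, finding the PID and then attaching, could be scripted roughly as follows. The helper names are hypothetical, and the `ps -C <name> -o pid=` invocation assumes the procps version of `ps` found on most Linux distributions. Only the `-p` option named in the heading is used for strace.

```python
import subprocess


def pids_from_ps_output(output):
    """Extract PIDs from the output of `ps -C <name> -o pid=` (one per line)."""
    return [int(token) for token in output.split()]


def find_pids(name):
    """Ask ps for the PIDs of processes whose command name is `name`."""
    result = subprocess.run(
        ["ps", "-C", name, "-o", "pid="], capture_output=True, text=True
    )
    return pids_from_ps_output(result.stdout)


def strace_attach_command(pid):
    """Build the argument list that attaches strace to a running process."""
    return ["strace", "-p", str(pid)]


if __name__ == "__main__":
    for pid in find_pids("firefox"):
        # Print the command rather than running it; attaching typically
        # requires owning the process or having root privileges.
        print(" ".join(strace_attach_command(pid)))
```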

       
    • Long-running server applications can easily execute millions of common data-intensive system calls each day, incurring large data copy overheads.
    • Applications like FTP, HTTP, and mail servers move a lot of data across the user-kernel boundary. It is well understood that this cross-boundary data movement imposes a significant overhead on the application, hampering its performance. For data-intensive applications, data copies to user-level processes could slow overall performance by two orders of magnitude [33]. For example, to serve an average Web page that includes five external links (for instance, images), a Web server executes 12 read and write system calls. Thus a busy Web server serving 1000 hits per second will have executed more than one billion costly data-intensive system calls each day.
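The "more than one billion" figure follows directly from the numbers quoted above. A quick check, with constant names chosen here for illustration:

```python
SYSCALLS_PER_PAGE = 12          # read and write calls per average Web page
HITS_PER_SECOND = 1000          # load on the busy Web server
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

syscalls_per_day = SYSCALLS_PER_PAGE * HITS_PER_SECOND * SECONDS_PER_DAY
print(f"{syscalls_per_day:,}")  # 1,036,800,000
```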