Difference between revisions of "Preventing Your Programs From Overrunning Our Computers"

 
(70 intermediate revisions by 4 users not shown)
Line 1: Line 1:
==INTRO==
+
== Introduction ==
  
:As part of being a Computer Science student, you will no doubt create and run your own programs on our [https://support.cs.jhu.edu/wiki/Category:Linux_Clients CS Linux Systems]   You also might be downloading programs from elsewhere and running then on our systems as well.
+
:As part of being a Computer Science student, you will no doubt create and run your own programs on our [[:Category:Linux_Clients|CS Linux Systems]].  You might also download programs from elsewhere and run them on our systems.
  
 
:It's possible that software you write (or software you've downloaded and run) could, when running, exceed the existing resources on our systems.
 
:It's possible that software you write (or software you've downloaded and run) could, when running, exceed the existing resources on our systems.
  
==Computing System Resources That Could Be Exceeded==
+
== Computing System Resources that Could Be Exceeded ==
  
 
* Memory (RAM)
 
* Memory (RAM)
Line 12: Line 12:
  
 
* Disk space
 
* Disk space
** Ugrad users have a [https://support.cs.jhu.edu/wiki/Disk_Quotas disk quota ] to prevent that.
+
** ''Ugrad Net'' users have a [[Disk Quotas|disk quota]] to prevent that.
** Diskspace does not only include your home directory, but can also include your use of the Linux systems's '''/tmp''' directory.
+
** ''Grad/Research Net'' users do not have specific disk quotas but must realize that they  share their disk space with faculty, staff, researchers, other students, etc. So, they must pay close attention to the disk space they use.
 +
** Disk space includes not only your home directory but also any space you use in the Linux systems' '''/tmp''' directory.
  
==Typical Symptoms Of Exceeded Computer Resources==
+
==Typical Symptoms of Exceeding Computer Resources on Our Linux Systems==
  
 
* Timeouts ssh'ing into the computer.
 
* Timeouts ssh'ing into the computer.
Line 23: Line 24:
 
* Programs stop running because they run out of RAM or swap disk space.
 
* Programs stop running because they run out of RAM or swap disk space.
  
==Who Is Affected When Your Program Uses Up Resources On One Of Our Linux Systems?==
+
==Who Is Affected When Your Program Uses Up Resources on One of Our Linux Systems?==
  
:Because all our our CS Linux computers are '''multiuser''' systems, programs that exceed the computer's resources, slowing down the system, could potentially affect '''''all''''' the users logged into those systems, remotely, or at the actual computer console.  
+
:{{red|Because all of our CS Linux computers are '''multiuser''' systems, programs that exceed the computer's resources and slow down the system could potentially affect '''''all''''' the users logged into those systems, whether remotely or at the actual computer console.}}
  
 
:Exceeding the computer's resources could also affect programs running in the background, perhaps as part of the computer's own system software or other users' batch jobs.   
 
:Exceeding the computer's resources could also affect programs running in the background, perhaps as part of the computer's own system software or other users' batch jobs.   
  
:So... always be aware that you are not necessarily the only one logged into a system when you're running your own programs.
+
:'''''So... always be aware that you are not necessarily the only one logged into a system when you're running your own programs.'''''
  
==Tips To Prevent Your Programs From Exceeding The System Resources On Our Computers==
+
==Tips to Prevent Your Programs from Exceeding the System Resources on Our Computers==
  
*  Try to design your programs and/or datasets so that their running memory requirements don't exceed the capabilities of system they're running on.
+
*  Try to design your programs and/or datasets so that their running memory requirements '''do not exceed''' the capabilities of the system they're running on. Therefore, become familiar with the specs of our CS Linux systems:
** [https://support.cs.jhu.edu/wiki/Linux_Clients_on_the_CS_Grad/Research_Net Specs on our Grad Net Linux Boxes ]
+
** [[Linux_Clients_on_the_CS_Grad/Research_Net|Specs on our Grad/Research Net Linux Computers]]
** [https://support.cs.jhu.edu/wiki/Linux_Clients_on_the_CS_Undergrad_Net Specs on our Ugrad Net Linux Boxes ]
+
** [[Linux_Clients_on_the_CS_Undergrad_Net|Specs on our Ugrad Net Linux Computers]]
  
* If you're going to use most of a system's memory.==
+
* Try to '''avoid using the system's complete RAM memory''' as much as possible.  Put safeguards into your programs to monitor your use of memory.
** Try to avoid using the systems's complete RAM memory.  Put safeguards into your programs to monitor your use of memory.
 
*** use the '''top''' or '''htop'' programs to see your program's stats.
 
**** top -u ''username''
 
**** htop -u ''username''  or htop --user=''username''
 
  
 +
* Try to '''run your programs during low-demand times of day''' (mornings are best; late evenings are the busiest).
  
** try not to tie it up for longer periods of time,
+
* Try '''not to tie up the computer's resources for longer''' than the program needs to run.
  
** try to run your programs during low-demand times of day
+
:: '''Do not forget that you are running programs.'''  One of the biggest areas we've had issues with is that some users start their programs in the background and forget that the programs are still running. And in some cases, the programs have issues and start using more resources than actually needed.
(mornings, mostly; late evenings are the most busy).
 
  
** '''''Remember that because these are multiuser systems, other users could be logged into the system and could be affected if your program exceeds the computer's resources.'''''  
+
::'''Therefore... ''please make sure you check in on your running programs often, and use monitoring tools to make sure you are not exceeding resources.'''''
  
ugradx and ugradz have 32 GB of RAM each, and the lab machines have 16
+
==Monitoring Your Processes' Memory and CPU Usage with Linux Commands==
GB.
+
:Use the '''top''', '''htop''', or '''ps''' programs to see your programs' stats.
 +
:* '''top'''
 +
:** top -u ''username''
 +
:* '''htop'''
 +
:** htop -u ''username''
 +
:** htop --user=''username''
 +
:* '''ps'''
 +
:** ps -u ''username''
  
Keep in mind that these are all multiuser systems, so other people need to
+
==Ending or Modifying Your Running Process Using Linux Commands==
be able to use them even while your programs are running.   or
+
 
 +
===Ending a Process===
 +
 
 +
:* '''kill''' - Send a signal to a process, usually to stop the process from running (which is why it's named "kill")
 +
:** First, determine process ID (PID)
 +
:** Start by sending the HUP [https://www.tecmint.com/how-to-kill-a-process-in-linux signal]
 +
:***  Example: Kill process with PID 15476 using signal HUP:
 +
:**** <code>kill -HUP 15476</code>
 +
:** If the process is still running, use the TERM signal.  (This is the default signal if one is not specified.)
 +
:*** e.g.: <code>kill -TERM ''15476''</code>
 +
:** If the process is ''still'' running, use the KILL signal.  This forces the process to exit without allowing it to clean up after itself, so it's recommended only if the other two signals don't have an effect.  (Some online references refer to "<code>kill -9</code>"; that is the same as "<code>kill -KILL</code>".)
 +
:*** e.g.: <code>kill -KILL ''15476''</code>
 +
 
 +
:* '''pkill''' - send a signal to ''all'' processes that match the given criteria.
 +
:** As with <code>kill</code>, you can choose from the HUP, TERM, and KILL signals.  For <code>pkill</code>, it's often easiest to just start with the default signal (which is TERM).
 +
:** To signal all of your processes, use the <code>-u</code> parameter along with your account name.  (This will end ''all'' of your processes, including the shell you're using to run it.)
 +
:*** e.g.: <code>pkill -u ''username''</code>
 +
 
 +
:* From within the '''top''' command
 +
:** Find desired process ID (PID)
 +
:** press '''k''' (for ''kill'')
 +
:** enter PID at prompt.
 +
:** enter signal number to use (as with <code>kill</code> above, it often works best to try 1 (HUP) first, then 15 (TERM), followed by 9 (KILL)).
 +
 
 +
===Modifying a Process's Running Priority===
 +
 
 +
Every process on our systems has a priority (its "nice" value), which defaults to 0, signifying a neutral priority.  Higher numbers, up to 19, represent ''lower'' priorities; the programs are being "nice" and allowing other programs to run more often.  The higher the number, the nicer the process: a priority of 10 yields to other programs more readily than a priority of 5, and less readily than a priority of 19.  (Negative numbers, from -1 down to -20, represent very ''high'' priorities and are reserved for system processes.)
 +
 
 +
To ''change'' a process' priority, use the '''<code>renice</code>''' command:
 +
 
 +
<kbd>renice <var>priority</var> <var>pid</var></kbd>
 +
 
 +
:"<var>priority</var>" gives the new priority for the process, from 1 to 19.  Note that ''you can only ever raise the priority number''; once a process is at, say, 10, you can raise it to 15 but you cannot lower it to 5.
 +
 
 +
:"<var>pid</var>" is the process ID for your process.  You can get that from <code>top</code> or <code>ps</code>, [[#Monitoring Your Processes' Memory and CPU Usage with Linux Commands|as described above]].
 +
 
 +
To set a process' priority ''when you start it'', use the '''<code>nice</code>''' command:
 +
 
 +
<kbd>nice <var>your program ...</var></kbd>
 +
<kbd>nice -n <var>priority</var> <var>your program ...</var></kbd>
 +
 
 +
If you don't give a priority via the <code>-n</code> parameter, it defaults to a priority of 10.
 +
 
 +
==And, Worth Mentioning Again...==
 +
: Please remember that because these are '''multiuser''' systems,  ''other users could be logged into the system and could be affected if your running program exceeds the computer's resources.''
 +
 
 +
[[Category:Linux Clients]]
 +
[[Category:Computers Available on the CS Network]]
 +
[[Category:Troubleshooting]]

Latest revision as of 20:28, 2 October 2024

Introduction

As part of being a Computer Science student, you will no doubt create and run your own programs on our CS Linux Systems. You might also download programs from elsewhere and run them on our systems.
It's possible that software you write (or software you've downloaded and run) could, when running, exceed the existing resources on our systems.

Computing System Resources that Could Be Exceeded

  • Memory (RAM)
  • CPU (processing power)
  • Disk space
    • Ugrad Net users have a disk quota to prevent that.
    • Grad/Research Net users do not have specific disk quotas but must realize that they share their disk space with faculty, staff, researchers, other students, etc. So, they must pay close attention to the disk space they use.
    • Disk space includes not only your home directory but also any space you use in the Linux systems' /tmp directory.
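
To check how much disk space you are using, a small shell sketch along these lines may help. The function names dir_usage_kb and my_tmp_files are made up for this example; du, find, awk, and id are standard commands.

```shell
# Print the total size of a directory tree in kilobytes.
# (du -sk reports a single summary line in 1 KB units.)
dir_usage_kb() {
    du -sk "$1" 2>/dev/null | awk '{print $1}'
}

# List regular files under /tmp that belong to you.
# find's -user test also accepts a numeric user ID, which id -u prints.
my_tmp_files() {
    find /tmp -user "$(id -u)" -type f 2>/dev/null
}
```

For example, `dir_usage_kb "$HOME"` prints the size of your home directory, and `my_tmp_files` shows leftover files in /tmp that you may want to delete.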

Typical Symptoms of Exceeding Computer Resources on Our Linux Systems

  • Timeouts ssh'ing into the computer.
  • Programs run much slower than usual.
  • Programs stop running because they run out of RAM or swap disk space.

Who Is Affected When Your Program Uses Up Resources on One of Our Linux Systems?

Because all of our CS Linux computers are multiuser systems, programs that exceed the computer's resources and slow down the system could potentially affect all the users logged into those systems, whether remotely or at the actual computer console.
Exceeding the computer's resources could also affect programs running in the background, perhaps as part of the computer's own system software or other users' batch jobs.
So... always be aware that you are not necessarily the only one logged into a system when you're running your own programs.

Tips to Prevent Your Programs from Exceeding the System Resources on Our Computers

  • Try to avoid using the system's complete RAM memory as much as possible. Put safeguards into your programs to monitor your use of memory.
  • Try to run your programs during low-demand times of day (mornings are best; late evenings are the busiest).
  • Try not to tie up the computer's resources for longer than the program needs to run.
Do not forget that you are running programs. One of the most common problems we see is that some users start programs in the background and forget that they are still running. In some cases, those programs malfunction and start using more resources than they actually need.
Therefore... please make sure you check in on your running programs often, and use monitoring tools to make sure you are not exceeding resources.
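
One possible safeguard is to start a program with hard limits already in place, so it cannot run away unnoticed. The sketch below is only an illustration: run_limited is a hypothetical helper name, and it assumes the shell's ulimit -v builtin (limit in KB) and the GNU coreutils timeout command are available. Note that ulimit -v caps virtual address space, which can be larger than the RAM a program really uses, so some programs may need a generous value.

```shell
# Run a command with a cap on virtual memory (KB) and wall-clock time (seconds).
# Usage: run_limited MEM_KB SECONDS command [args...]
run_limited() {
    mem_kb=$1
    secs=$2
    shift 2
    (
        # The limit applies only inside this subshell and its children.
        ulimit -v "$mem_kb" || exit 125
        # timeout ends the command after SECONDS and then exits with status 124.
        exec timeout "$secs" "$@"
    )
}
```

For example, `run_limited 4194304 3600 ./my_program` would allow roughly 4 GB of virtual memory and one hour of run time; the program name and both numbers are placeholders.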

Monitoring Your Processes' Memory and CPU Usage with Linux Commands

Use the top, htop, or ps programs to see your programs' stats.
  • top
    • top -u username
  • htop
    • htop -u username
    • htop --user=username
  • ps
    • ps -u username
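
For a quick, non-interactive overview, ps can also be asked for specific columns. The following is a sketch that assumes the procps version of ps found on most Linux systems (the --sort option is specific to it); my_procs is a made-up helper name.

```shell
# List your own processes, largest resident memory (RSS, in KB) first.
# Columns: process ID, %CPU, %MEM, resident memory, elapsed time, command line.
my_procs() {
    ps -u "$(id -u)" -o pid,pcpu,pmem,rss,etime,args --sort=-rss
}
```

Running `my_procs | head` shows your most memory-hungry processes, and the ELAPSED column helps spot programs you started long ago and forgot about.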

Ending or Modifying Your Running Process Using Linux Commands

Ending a Process

  • kill - Send a signal to a process, usually to stop the process from running (which is why it's named "kill")
    • First, determine process ID (PID)
    • Start by sending the HUP signal
      • Example: Kill process with PID 15476 using signal HUP:
        • kill -HUP 15476
    • If the process is still running, use the TERM signal. (This is the default signal if one is not specified.)
      • e.g.: kill -TERM 15476
    • If the process is still running, use the KILL signal. This forces the process to exit without allowing it to clean up after itself, so it's recommended only if the other two signals don't have an effect. (Some online references refer to "kill -9"; that is the same as "kill -KILL".)
      • e.g.: kill -KILL 15476
  • pkill - send a signal to all processes that match the given criteria.
    • As with kill, you can choose from the HUP, TERM, and KILL signals. For pkill, it's often easiest to just start with the default signal (which is TERM).
    • To signal all of your processes, use the -u parameter along with your account name. (This will end all of your processes, including the shell you're using to run it.)
      • e.g.: pkill -u username
  • From within the top command
    • Find the desired process ID (PID).
    • Press k (for kill).
    • Enter the PID at the prompt.
    • Enter the signal number to use (as with kill above, it often works best to try 1 (HUP) first, then 15 (TERM), followed by 9 (KILL)).
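
The HUP, then TERM, then KILL escalation described above can be sketched as a small shell function. This is an illustration rather than an official tool: stop_gently is a hypothetical name, and the 5-second default grace period is an arbitrary choice.

```shell
# Try to stop a process politely, escalating only if it keeps running.
# Usage: stop_gently PID [GRACE_SECONDS]
stop_gently() {
    pid=$1
    grace=${2:-5}
    for sig in HUP TERM KILL; do
        # kill -0 sends no signal; it only tests whether the process exists.
        kill -0 "$pid" 2>/dev/null || return 0
        kill -s "$sig" "$pid" 2>/dev/null || return 0
        # Give the process time to clean up and exit before escalating.
        sleep "$grace"
    done
    # Succeed only if the process is gone after the final signal.
    ! kill -0 "$pid" 2>/dev/null
}
```

For example, `stop_gently 15476` sends HUP to process 15476, waits, and moves on to TERM and KILL only if the process is still there.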

Modifying a Process's Running Priority

Every process on our systems has a priority (its "nice" value), which defaults to 0, signifying a neutral priority. Higher numbers, up to 19, represent lower priorities; the programs are being "nice" and allowing other programs to run more often. The higher the number, the nicer the process: a priority of 10 yields to other programs more readily than a priority of 5, and less readily than a priority of 19. (Negative numbers, from -1 down to -20, represent very high priorities and are reserved for system processes.)

To change a process's priority, use the renice command:

renice priority pid
"priority" gives the new priority for the process, from 1 to 19. Note that you can only ever raise the priority number; once a process is at, say, 10, you can raise it to 15 but you cannot lower it to 5.
"pid" is the process ID for your process. You can get that from top or ps, as described above.

To set a process's priority when you start it, use the nice command:

nice your program ...
nice -n priority your program ...

If you don't give a priority via the -n parameter, it defaults to a priority of 10.
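
To see these commands in action, the sketch below wraps them in two small helpers. The names nice_value_for and make_nicer are invented for this example, and the first helper relies on the GNU coreutils behavior that running nice with no command prints the current nice value.

```shell
# Print the nice value a command would run at if started with "nice -n LEVEL".
# The inner "nice" has no command, so it just prints its own nice value.
nice_value_for() {
    nice -n "$1" nice
}

# Lower the priority of an already-running process, using the
# "renice priority pid" form shown above.
# Usage: make_nicer PRIORITY PID
make_nicer() {
    renice "$1" "$2"
}
```

From a shell running at the default priority of 0, `nice_value_for 15` prints 15. A long-running job could be started with `nice -n 15 ./my_program &` (the program name is a placeholder) and made nicer later with `make_nicer 19 "$!"`.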

And, Worth Mentioning Again...

Please remember that because these are multiuser systems, other users could be logged into the system and could be affected if your running program exceeds the computer's resources.