GCP Production readiness
Objectives: DO THIS ON THE HOST
The NVIDIA driver, CUDA, and cuDNN must not update automatically and break the setup
Uptime: 24 hours a day
Manual upgrade/update of libraries over long weekends, with prior notification to our clients (Driscoll's/Olam)
First, bring up a parallel machine with the new updates/upgrades and switch the redirection in Route53 (AWS) to point to the new machine. Then upgrade/update the old machine, or delete that instance
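The cutover step above could be sketched as follows. This is a minimal dry-run sketch, not the team's actual procedure: the hosted zone ID, record name, IP address, and TTL are hypothetical placeholders, and it assumes the service is reached through a plain A record. It builds the change batch for `aws route53 change-resource-record-sets` and only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: repoint a Route53 A record at the new machine (dry run by default).
# All values below are hypothetical placeholders, not taken from the real setup.
ZONE_ID="${ZONE_ID:-Z0000000EXAMPLE}"
RECORD_NAME="${RECORD_NAME:-api.example.com.}"
NEW_IP="${NEW_IP:-203.0.113.10}"
DRY_RUN="${DRY_RUN:-1}"

# Print the Route53 change batch that upserts the A record to the given IP.
build_change_batch() {
    # $1: record name, $2: new IPv4 address
    cat <<EOF
{
  "Comment": "Cut over to the upgraded machine",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "$1",
      "Type": "A",
      "TTL": 60,
      "ResourceRecords": [{"Value": "$2"}]
    }
  }]
}
EOF
}

BATCH="$(build_change_batch "$RECORD_NAME" "$NEW_IP")"

if [ "$DRY_RUN" = "1" ]; then
    # Only show what would be sent; nothing is changed in AWS.
    echo "Would run: aws route53 change-resource-record-sets --hosted-zone-id $ZONE_ID --change-batch <json>"
    echo "$BATCH"
else
    aws route53 change-resource-record-sets \
        --hosted-zone-id "$ZONE_ID" \
        --change-batch "$BATCH"
fi
```

A short TTL (60 seconds here, an assumed value) limits how long clients keep resolving to the old machine after the switch.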
To switch off automatic updates, edit /etc/apt/apt.conf.d/10periodic (sudo vim /etc/apt/apt.conf.d/10periodic) and turn off the following by setting it to "0":
APT::Periodic::Update-Package-Lists "0";
Switching off auto-updates/upgrades takes more than this one setting. Also turn off unattended upgrades by running the following command and answering "No" when prompted:
sudo dpkg-reconfigure -plow unattended-upgrades
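Disabling unattended upgrades does not stop a manual `apt upgrade` from pulling in a new driver, so one option is to also pin the GPU packages with `apt-mark hold`. The following is a minimal sketch under assumptions: the package-name prefixes (`nvidia`, `cuda`, `libcudnn`) are a guess at how the packages are named on the host and should be checked against `dpkg -l` first. It prints the commands by default and only applies them with `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch: hold NVIDIA/CUDA/cuDNN packages so apt does not upgrade them.
DRY_RUN="${DRY_RUN:-1}"

# Keep only package names that look GPU-related (prefixes are an assumption).
filter_gpu_packages() {
    grep -E '^(nvidia|cuda|libcudnn)' || true
}

# List package names known to dpkg, one per line; empty if dpkg-query is missing.
list_installed() {
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${Package}\n'
    fi
}

for pkg in $(list_installed | filter_gpu_packages); do
    if [ "$DRY_RUN" = "1" ]; then
        echo "Would run: sudo apt-mark hold $pkg"
    else
        sudo apt-mark hold "$pkg"
    fi
done

# Show what is currently held, when apt-mark is available.
if command -v apt-mark >/dev/null 2>&1; then
    apt-mark showhold
fi
```

Before a planned manual upgrade, `sudo apt-mark unhold <package>` releases the hold again.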
# Files to look out for:
/etc/apt/apt.conf.d/10periodic
/etc/apt/apt.conf.d/20auto-upgrades
/etc/apt/apt.conf.d/50unattended-upgrades
/etc/cron.daily/apt-compat
/usr/lib/apt/apt.systemd.daily
Ref: https://debian-handbook.info/browse/stable/sect.regular-upgrades.html
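A quick way to review the configuration files listed above is to print every `APT::Periodic` and `Unattended-Upgrade` setting they contain. The sketch below is one possible check, not an official tool; the `APT_CONF_DIR` variable and the `show_periodic_settings` helper are names introduced here for illustration.

```shell
#!/bin/sh
# Sketch: show the periodic/unattended-upgrade settings in the APT config files.
APT_CONF_DIR="${APT_CONF_DIR:-/etc/apt/apt.conf.d}"

# Print uncommented APT::Periodic / Unattended-Upgrade lines from one file.
show_periodic_settings() {
    # $1: path to an APT configuration file
    if [ -f "$1" ]; then
        echo "== $1"
        grep -E '^[[:space:]]*(APT::Periodic|Unattended-Upgrade)' "$1" \
            || echo "(no matching settings)"
    else
        echo "== $1 (not present)"
    fi
}

for name in 10periodic 20auto-upgrades 50unattended-upgrades; do
    show_periodic_settings "$APT_CONF_DIR/$name"
done

# The merged view that apt actually uses, when apt-config is available.
if command -v apt-config >/dev/null 2>&1; then
    apt-config dump | grep 'APT::Periodic' \
        || echo "(no APT::Periodic settings in effect)"
fi
```

Any `APT::Periodic` value other than "0" in the output means a periodic job is still enabled.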
Preparing the GCP machine with the NVIDIA driver and CUDA
Preparing Docker for the NVIDIA driver
GCP Docker setup
GCP 'scp' issue
Ref: https://console.cloud.google.com/support/cases/detail/19312938?organizationId=599160234450
GCP cloud drive
Ref: https://cloud.google.com/storage/docs/quickstart-gsutil#create
Ref: https://cloud.google.com/compute/docs/disks/gcs-buckets
GCP cloud storage fuse
Ref: https://cloud.google.com/storage/docs/gcs-fuse
Ref: https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/docs/mounting.md
Making python point to python3 and starting uWSGI/Flask in python3
Optimized TensorFlow Serving
Dealing with concurrent requests using nginx, uWSGI, and Flask
Ref: https://www.reddit.com/r/Python/comments/4s40ge/understanding_uwsgi_threads_processes_and_gil/