You each come to us with a unique set of experiences with computing, with more or less experience depending on your previous needs.
A challenge we have seen over the many years we've been helping people is understanding the context of computing in their research, so they can understand the tools they have available.
You may already have an idea of what this is and experience with computing, but many who come to us know it's valuable and are ready to learn why.
The MSU HPCC is a great example of when on-premise computing is more beneficial.
Of course you know what is in a computer. The goal is to come to a common understanding, to frame it for extension to the cloud, and to find the cloud services that mimic these features. If you hadn't thought about these components, that proves that Steve Jobs has won in abstracting away the computer, much like cars went from the Model T to today. Who knows how to change their oil, or the difference between a carburetor and turbocharging?
I/O is used for the screen, printer, camera, sound, internal disk, and network connection. I/O attaches to 'ports' on the computer; for servers, we assign each 'port' a number. For cloud we have only the network connection. Questions: 1) Has anyone used mounted storage from a network file server? Open question: where is the data? A: everywhere.
You can run a server on your laptop.
The 'client/server' model, invented in the 1960s, is so successful that we use servers in our daily lives and don't think about it (except when the server is down). This model of computing is important because it's the basis of cloud computing.
This is only part of the message, but it is a necessary part.
What is the host in this URL? What is the message? We could spend a week talking about web servers and protocols, and a year on programming web servers. The important thing is that there is a host, the 'web server' software on the host, and the client(s) connecting to it to get something.
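As a small sketch of the host vs. message idea, here is how Python's standard library pulls apart a URL; the URL below is made up for illustration, not one from the lesson.

```python
# A minimal sketch: splitting a URL into the host (which computer we talk to)
# and the rest of the "message" we send it. The URL here is hypothetical.
from urllib.parse import urlparse

url = "https://example.msu.edu/research/data?species=bee"
parts = urlparse(url)

print("host: ", parts.netloc)   # example.msu.edu -> the computer running the web server
print("path: ", parts.path)     # /research/data  -> what we ask that host for
print("query:", parts.query)    # species=bee     -> extra details in the request
```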
Example servers that do not use web clients are data servers, for example a relational database server.
I can run a web server right on my laptop, but you couldn't reach it; the network is me talking to myself.
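A minimal sketch of that "talking to myself" setup, using only Python's standard library (the port number is an arbitrary choice):

```python
# Serve the current directory on this laptop only. 127.0.0.1 means "this machine",
# so a browser on the same laptop can reach http://127.0.0.1:8000/ but nobody else can.
from http.server import HTTPServer, SimpleHTTPRequestHandler

HOST, PORT = "127.0.0.1", 8000   # 8000 is an arbitrary, unprivileged port number

server = HTTPServer((HOST, PORT), SimpleHTTPRequestHandler)
print(f"Serving at http://{HOST}:{PORT}/  (Ctrl+C to stop)")
server.serve_forever()
```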
Why do I think this is important? Not only can you make a server (web, data, cluster, etc.) with the cloud, but everything you interact with in the cloud is a server. You will see many services dedicated to networking in the cloud, and this is why.
D2L, the course management system, is a required web-based system.
MSU originally ran, or 'hosted', D2L itself: it started by acquiring the web application software and associated systems, building all of the hardware (servers), the disk space to hold it all, the network to connect them, and the data center space, and when you connected to D2L, you connected to a system on the MSU campus.
- Scalability: when demand was very high, the system was overwhelmed
- Maintainability: it required many people to keep it running
- Cost: the company "Brightspace" offered to host D2L for institutions. The student experience is identical. MSU switched to that model and saved money. D2L was slow this week because now, when we access D2L, we share infrastructure with everyone in the world.
Screenshot of the AWS portal for creating a server.
## Other Web-based services that are _not_ cloud computing
- **Web hosting** Offered many of these features but was limited in service offerings. I've used a company called DreamHost since the early 2000s to provide websites for non-profits and commercial customers, but also email, storage, and limited database services.
- **Co-location** Bring your own hardware; e.g., data center space only
- **Server Rental** Servers on the internet you could use for various things, primarily web sites & applications. (Rackspace)
- **Other remote computing services** For example, sending your accounting data to an external service for processing (which now seems quaint). EDS, Ross Perot's company, provided IT and data services to major corporations (primarily GM) through the 80s and 90s.
discussion
You may have looked at the various websites and poked around the web, and found it's just not clear at all how cloud computing may be helpful to you, even though it all sounds great. The challenge for researchers learning about the cloud is that most cloud documentation isn't written for you.
Cloud training and documentation are mostly written for IT professionals like system admins and architects, software developers, business people, and agency managers. Researchers tend to be a little of all of those things.
Each of these groups often has an embedded conceptual model of computing, and this model depends on their approach.
Cloud services also change frequently, even if only slightly, making technology-specific tutorials obsolete in months. For example, last year Azure had a "Notebook Service" for running Python notebooks, and now they point you to "Azure Machine Learning".
To enhance reproducibility in your own work, consider documenting all the steps needed to create the environment that runs your computation. For many on-premise academic systems (e.g. the MSU HPCC), we depend upon the system administrators to create that environment, but we may install and configure all the software we need to run our code. Workflow thinking can apply to the scientific domain itself (e.g. "Principles for data analysis workflows" https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008770 ) and to the provisioning of the cloud computing environment. That is, we may use a workflow system for creating all the cloud resources we need, and then a different workflow system that runs on those resources. One example: we may [create an HPC system on Azure using templates](https://azure.microsoft.com/en-us/resources/templates/create-hpc-cluster/) and then launch the Slurm scheduler on that HPC system to run our jobs. (*Note: the complexity of running our own HPC is beyond the scope of this fellowship; it is used as an example only.*)
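One lightweight way to start documenting that environment is to record the interpreter version, the platform, and the installed packages alongside your results. This is only a sketch; nothing in it is specific to any cloud provider or to the HPCC.

```python
# A minimal sketch of capturing the software environment so a computation can be
# reproduced later. Writes a JSON snapshot next to your results.
import json
import platform
import subprocess
import sys

snapshot = {
    "python": sys.version,
    "platform": platform.platform(),
    # "pip freeze" lists the exact versions of every installed package
    "packages": subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines(),
}

with open("environment-snapshot.json", "w") as f:
    json.dump(snapshot, f, indent=2)
```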
Example, for gene network ML inference: a machine to run the ML would cost $650/month, but if provisioned only when needed, it's 5 cents per job.
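To make that comparison concrete, a rough back-of-the-envelope sketch (the $650/month and 5 cents/job figures are from the example above; the job count is made up):

```python
# Break-even arithmetic: always-on machine vs. provisioning only when a job runs.
ALWAYS_ON_PER_MONTH = 650.00   # dollars/month for a dedicated ML machine
ON_DEMAND_PER_JOB   = 0.05     # dollars per job when provisioned only as needed

jobs_per_month = 200           # hypothetical workload
on_demand_cost = jobs_per_month * ON_DEMAND_PER_JOB
print(f"{jobs_per_month} jobs on demand: ${on_demand_cost:.2f} vs always-on: ${ALWAYS_ON_PER_MONTH:.2f}")

# How many jobs per month before the always-on machine becomes the cheaper option?
break_even_jobs = ALWAYS_ON_PER_MONTH / ON_DEMAND_PER_JOB
print(f"On-demand remains cheaper below {break_even_jobs:,.0f} jobs per month")
```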
We will show an example project from 2021.
Attackers may use the services you create to launch attacks on other services, leaving you liable.
- The "shared responsibility" model for cloud computing takes a model of computing components and shows how much of each component the user is responsible for securing.
![Microsoft Model of Shared Responsibility](../img/microsoft-shared-responsibility.svg)
*[Microsoft Model of Shared Responsibility for Cloud Computing](https://docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility)*
We will come back to this model as we gain deeper understanding of research computing on the cloud.
- Big Data systems
- Long-running Data Systems like database servers
- Web-based applications
- Windows applications
- Systems with complex or specific configuration needs
So we will talk about costs extensively.
---
## Value Proposition of Cloud Computing
- Costs are more than just dollars for services. Consider `[Total Cost] = ( $ + Time + Risk )`
- `[Total Time] = ( development time + wait time + compute time ) `
- Security risks are rarely negligible, so factor them into cost
- In the service-level spectrum, the higher-level "platform" services may have higher monetary costs but often reduce time and risk (see the sketch after this list)
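A toy sketch of that trade-off, following the formulas above; every number here is hypothetical, including the dollar value assigned to an hour of researcher time.

```python
# Total Cost = $ + Time + Risk, with time converted to dollars. All figures invented.
HOURLY_VALUE_OF_TIME = 50.0   # assumed dollar value of one researcher-hour

def total_cost(service_dollars, dev_hours, wait_hours, compute_hours, risk_dollars):
    total_time = dev_hours + wait_hours + compute_hours   # [Total Time]
    return service_dollars + total_time * HOURLY_VALUE_OF_TIME + risk_dollars

# Lower-level (infrastructure) service: cheaper sticker price, more of your time and risk.
infrastructure = total_cost(100, dev_hours=40, wait_hours=5, compute_hours=10, risk_dollars=500)
# Higher-level (platform) service: higher sticker price, less setup time and less risk carried by you.
platform_svc = total_cost(300, dev_hours=10, wait_hours=2, compute_hours=10, risk_dollars=100)

print(f"infrastructure-style total: ${infrastructure:,.0f}   platform-style total: ${platform_svc:,.0f}")
```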
## Summary and additional comments
Summary:
- Freely explore cloud services using the portal; there are often free tiers.
- Try the programming interfaces at least once, as this will make your work reproducible.
- Security is always a concern and a consideration of cost.
- Look to the higher-level services; even though more expensive, they may get you to results faster and be more secure.
Types of services (do not say "as a service" or "aaS"): software, platform, infrastructure. Describe each; #3 is not part of CCF. Some services do not fit easily in this framework (e.g. GEE). Focus on the second two.
Pre-cloud history: important because services now use these words and metaphors to map onto cloud services. The people who used tech historically were the customers of the cloud, so the cloud had to make sense to the IT people it was sold to.
Client/server: email, shared files, Unix shell, FTP. Server = software that listens and acts.
Web: more passive than any of the others.
A web search engine is a server system.
A server can be in your office or on the web; if on the web, it's a computer somewhere that hosts the server.
A website needs 1) web content, 2) web server software, and 3) a computer on the internet to run that software. #1 is the easy part.
web site vs web application distinction.
Web 2.0 was so popular that web platform companies showed up in the 90s: DreamHost, Rackspace.
The service was limited to the stuff needed to run a web app = content + code, server, DB, storage.
(Describe the customer: who uses this service?) Example: the Bee Database.
advantages vs running your own web server
Big web companies still needed to run their own stuff.
different topic
Enterprise computing: bought rooms full of expensive hardware, networks, and software: company data systems, file storage, company-run email, moving huge data, etc. The "mainframe."
HPC: bought rooms full of really expensive hardware and really expensive networks. The software was free.
Computer == box. Each box needs management of its hardware & software.
Boxes are not flexible. Solution: virtualization.
One huge box, many virtual computers, which can be created as needed.
"Provisioning" is hard work: hardware, operating systems, software, etc.
How annoyed are you when you have to run an update? Imagine you had hundreds or thousands of boxes to keep updated, that hundreds or thousands of people depend on. Then hard drives crash, chips fail, etc.
Every work group has special needs.
What does this mean for researchers? We have an HPC to run our computation; why do we also need the cloud?
- Windows apps
- special configuration: e.g., getting the Unity game engine to run on our HPC (for evolution simulations) meant hacking the binary header of the program to trick it. Many others are not so lucky that a solution is ever found.
- data systems
- web-based programs
The HPC is amazing, but it is not great for everyone or everything.
Researchers had to DIY (my friend Jason), find a computer person (like I was in the past), or beg the IT department to get them what they need.
Other things were put in place for enterprise or Google (containers, custom hardware, etc.), but those were not designed with researchers in mind.
## Source Materials
https://softwaresim.com/blog/introduction-to-cloud-computing-for-research/