Some points to remember while selecting solutions using AWS
AWS Support plans: Basic, Developer, Business, and Enterprise
– S3: Simple Storage Service
– Key-based object storage
– Object size: 0 bytes to 5 TB
– Max single PUT size: 5 GB
– Use the multipart upload feature for objects larger than 100 MB
– Use Multipart Upload API if object size > 5GB
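The size thresholds above can be turned into a small planning helper. This is a minimal sketch, not an AWS API: `plan_upload` and `DEFAULT_PART_SIZE` are hypothetical names chosen for illustration, and the 10,000-part cap is the S3 limit on parts per multipart upload.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TB           # largest object S3 will store
MAX_SINGLE_PUT = 5 * GB            # largest object a single PUT can carry
MULTIPART_RECOMMENDED = 100 * MB   # above this, multipart upload is recommended
MAX_PARTS = 10_000                 # S3 limit on parts per multipart upload
DEFAULT_PART_SIZE = 100 * MB       # assumption: illustrative part size, not an AWS default


def plan_upload(size_bytes):
    """Return a rough upload plan for an object of the given size."""
    if size_bytes < 0 or size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 0 bytes and 5 TB")
    if size_bytes <= MULTIPART_RECOMMENDED:
        return {"method": "single_put", "required": False, "parts": 1}
    # Grow the part size when needed so the upload stays within 10,000 parts.
    part_size = max(DEFAULT_PART_SIZE, math.ceil(size_bytes / MAX_PARTS))
    return {
        "method": "multipart",
        "required": size_bytes > MAX_SINGLE_PUT,  # a single PUT cannot carry > 5 GB
        "part_size": part_size,
        "parts": math.ceil(size_bytes / part_size),
    }
```

For example, `plan_upload(6 * GB)` reports a multipart upload that is required rather than merely recommended.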
S3 storage types
- S3 Standard – 99.999999999% (11 nines) durability, 99.99% availability
- S3 Intelligent-Tiering – automatically changes the type based on access
- S3 Standard-Infrequent Access (S3 Standard-IA) – for infrequently accessed objects that still need a fast response when accessed. 99.9% availability.
- S3 One Zone-Infrequent Access (S3 One Zone-IA) – data is stored redundantly within a single Availability Zone. Good for backup copies or less frequently accessed data. If that Availability Zone is lost, the data is also lost. 99.5% availability.
- Amazon S3 Glacier (S3 Glacier) – low-cost archive storage (early deletion fees apply if objects are deleted within 90 days)
- Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive) for long-term archive and digital preservation.
If low cost and instant access are both required, use “S3 Standard-Infrequent Access”. If an even lower cost matters more and reduced availability is acceptable, use “S3 One Zone-Infrequent Access”. Don’t use “S3 One Zone-Infrequent Access” for critical files: it keeps data in a single AZ, so the data is unavailable while that AZ is down and is lost if the AZ is destroyed.
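The selection rule in the paragraph above can be written as a tiny decision function. This is a sketch under the assumptions stated in the comments: `pick_storage_class` is a hypothetical helper, and the returned strings are the storage class identifiers used by the S3 API.

```python
def pick_storage_class(frequent_access, instant_access, critical):
    """Map three yes/no requirements to an S3 storage class name."""
    if frequent_access:
        return "STANDARD"
    if not instant_access:
        # Assumption: archive-style data that can wait for retrieval goes to Glacier.
        return "GLACIER"
    # Infrequent access with instant retrieval: single-AZ storage only for
    # data that can be recreated if the Availability Zone is lost.
    return "STANDARD_IA" if critical else "ONEZONE_IA"
```

Calling `pick_storage_class(False, True, True)` picks Standard-IA for a critical, rarely read file, while the same call with `critical=False` picks One Zone-IA.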
Glacier objects can be retrieved by using REST APIs and S3 console.
- Amazon Macie: AI-powered security service that helps prevent data loss
- Data corruption check: Use Content-MD5 checksums or cyclic redundancy checks (CRCs)
- One bucket can have objects having different storage types
- S3 Inventory: provides scheduled reports of S3 objects
- S3 Batch Operations: with one click we can change thousands of objects
- Amazon S3 Object Lock: Additional lock from Amazon to prevent deletion of versions of the object for the specified retention period. Required for regulatory cases.
- S3 Transfer Acceleration can speed up file uploads. Users upload files to a nearby edge location, which forwards them to the destination S3 bucket.
- CloudFront is the CDN (content delivery network) from AWS. Each edge location caches objects for a time-to-live (TTL) value; clearing the cache or removing a file before the TTL expires can incur charges.
- S3 Cross-Region Replication (CRR) creates copies of objects in another region. Versioning must be enabled on both the source and destination buckets, and the two buckets must be in different regions.
- When CRR is enabled, existing files do not automatically get replicated, but all newer ones will.
- Delete markers are also not replicated by default.
- Lifecycle policies can automate the process of moving objects between different storages or add rules for objects.
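The data corruption check from the list above relies on the Content-MD5 value, which is the base64 encoding of the raw 16-byte MD5 digest of the body (not the hex string). A minimal sketch using only the standard library; `content_md5` and `body_is_intact` are hypothetical helper names:

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 header value for a request body (bytes)."""
    digest = hashlib.md5(body).digest()  # raw 16-byte digest, not hex
    return base64.b64encode(digest).decode("ascii")


def body_is_intact(body, expected_md5):
    """Compare a received body against the Content-MD5 value sent with it."""
    return content_md5(body) == expected_md5
```

The sender computes the value before uploading, and the receiver recomputes it; any mismatch means the body changed in transit.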
EC2 (Amazon Elastic Compute Cloud)
On-Demand: pay by the hour
Reserved: long-term contracts of 1 or 3 years at a lower price
Spot: you can bid on the EC2 price.
Dedicated: for users constrained by software licenses or regulations
Spot pricing: if AWS terminates the Spot Instance, you are not charged for the partial hour, but if you terminate it yourself, you are charged. Use Spot where the EC2 instance does not need to be available all the time or is needed only for some period. Use Reserved or Scheduled Reserved Instances for a low cost.
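The partial-hour rule described above can be expressed as a short calculation. This sketch models only the rule as stated in these notes (hourly billing, final partial hour waived when AWS interrupts the instance); `billed_spot_hours` is a hypothetical name, and actual billing granularity has changed over time, so treat it as an illustration rather than a pricing reference.

```python
def billed_spot_hours(runtime_minutes, interrupted_by_aws):
    """Hours billed for a Spot Instance under the partial-hour rule."""
    if runtime_minutes < 0:
        raise ValueError("runtime cannot be negative")
    full_hours, leftover = divmod(runtime_minutes, 60)
    if leftover and not interrupted_by_aws:
        # The user ended the instance: the started hour is billed in full.
        return full_hours + 1
    # AWS ended the instance (or there is no partial hour): full hours only.
    return full_hours
```

An instance that ran for 150 minutes is billed 2 hours if AWS reclaimed it and 3 hours if the user terminated it.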
- Boot device can be on local instance store or in EBS. EBS is preferred.
- Users are limited to run up to 20 On-Demand Instances
- EC2 termination protection is “turned off” by default. We need to turn it on.
- By default, the EBS root volume is also deleted when the instance is terminated.
- EBS root volume is by default not encrypted and cannot be directly encrypted. We need to use 3rd party tools or AMI image to encrypt it.
- An EBS volume can be attached only to an EC2 instance in the same Availability Zone.
- Instance metadata and user data: available from within the instance at http://169.254.169.254/latest/meta-data and http://169.254.169.254/latest/user-data
- IAM users and policy documents are applied globally.
- By default, a security group blocks all inbound traffic and allows all outbound traffic.
- Security groups are stateful whereas NACL is stateless.
- One security group can be tied to many instances and one instance can have many security groups.
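Instance metadata and user data, mentioned in the list above, are served from the link-local address 169.254.169.254 and are reachable only from inside the instance. A minimal sketch with the standard library; `metadata_url` and `fetch` are hypothetical helpers, and the plain GET assumes the instance allows unauthenticated (IMDSv1-style) requests, since instances that enforce session tokens need an extra token request first.

```python
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest"


def metadata_url(path="", kind="meta-data"):
    """Build the URL for a metadata or user-data lookup."""
    if kind not in ("meta-data", "user-data"):
        raise ValueError("kind must be 'meta-data' or 'user-data'")
    url = METADATA_BASE + "/" + kind
    path = path.strip("/")
    return url + "/" + path if path else url


def fetch(url, timeout=2.0):
    """GET a metadata URL; only works when run on an EC2 instance."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, `fetch(metadata_url("instance-id"))` would return the instance ID; anywhere else the request simply times out.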
Solid state drives:
General purpose : IOPS < 16K
Provisioned IOPS : IOPS < 64K
Hard disk drives:
Throughput optimized: IOPS < 500
Cold HDD: IOPS < 250
Magnetic (previous generation): rarely used. IOPS < 200
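The IOPS ceilings listed above can be used to shortlist volume types for a workload. This is a sketch only: `candidate_volume_types` is a hypothetical helper, the ceilings are copied from the list, and treating each ceiling as an inclusive upper bound is an assumption.

```python
# (volume type, approximate IOPS ceiling), ordered from slowest to fastest
EBS_MAX_IOPS = [
    ("Magnetic", 200),
    ("Cold HDD", 250),
    ("Throughput optimized HDD", 500),
    ("General purpose SSD", 16_000),
    ("Provisioned IOPS SSD", 64_000),
]


def candidate_volume_types(required_iops):
    """Return the volume types whose IOPS ceiling covers the requirement."""
    if required_iops < 0:
        raise ValueError("required_iops cannot be negative")
    return [name for name, ceiling in EBS_MAX_IOPS if required_iops <= ceiling]
```

A workload that needs 20,000 IOPS leaves only Provisioned IOPS SSD on the shortlist, and anything above 64,000 returns an empty list.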
- Hypervisor for EC2: Nitro and Xen
- CloudWatch: used for monitoring performance. By default (basic monitoring), it collects EC2 metrics every 5 minutes; detailed monitoring lowers this to 1 minute. We can create alarms from CloudWatch.
- CloudTrail is for auditing. It logs API activity, such as who created a user or an instance.
Cluster placement group: cannot span multiple Availability Zones
Spread placement group: can span multiple Availability Zones
- RDS is not serverless; only Aurora offers a serverless option (Aurora Serverless)
- ElastiCache: Memcached and Redis
- Memcached: simple cache that scales horizontally
- Redis: advanced features and Multi-AZ support
- Read Replicas: improve read performance; can be Multi-AZ
- Multi-AZ: for disaster recovery, not for performance
- Redshift: data warehouse used to analyze data, for example data loaded from RDS. Can be in 1 AZ
- Aurora: keeps 6 copies of the data, 2 copies in each of 3 distinct AZs. Aurora Replicas support automatic failover.
Route 53 and VPC
ELBs do not have a predefined IP address; we resolve them by DNS name. If given a choice between an alias record and a CNAME, use the alias record for AWS services.
Simple – One record, which can hold multiple IP addresses. No health check options available.
Weighted – Routes traffic in proportion to the weight assigned to each record.
Latency-based – Based on latency
Failover – Active/Passive.
Geolocation – Location-based, like US customer to US regions.
Geoproximity (Traffic Flow only) – Allows advanced settings based on the location of users and resources, such as latitude and longitude.
Multivalue answer – Same as simple, but with health check options.
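For the weighted policy in the list above, each record receives a share of traffic equal to its weight divided by the sum of all weights. A minimal sketch of that arithmetic; `traffic_shares` and the record names are hypothetical:

```python
def traffic_shares(weights):
    """Convert per-record weights into the fraction of traffic each receives."""
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("weights cannot be negative")
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be greater than zero")
    return {name: weight / total for name, weight in weights.items()}
```

With weights of 3 and 1, the first record gets three quarters of the traffic, which is a common way to send a small slice of users to a new deployment.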
When we create a new VPC, we get a default route table, NACL, and security group.
Subnets and an internet gateway are not created by default; we need to create them.
Amazon reserves 5 IPs within each subnet
1 subnet = 1 AZ
1 internet gateway per VPC
Security groups cannot span VPCs
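The five reserved addresses per subnet noted above are the first four addresses and the last one, so the usable count is the subnet size minus 5. A sketch with the standard `ipaddress` module; `usable_addresses` and `reserved_addresses` are hypothetical helpers, and the example CIDR blocks are illustrative.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr):
    """Number of addresses left for instances in a subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def reserved_addresses(cidr):
    """The first four addresses and the last address of the subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    first_four = [str(network[i]) for i in range(4)]
    return first_four + [str(network[-1])]
```

A /24 subnet therefore offers 251 addresses rather than 256, and a /28 offers only 11.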
NAT Instances: We can create NAT instances in public subnets to give instances in private subnets outbound internet access. They sit behind security groups.
NAT Gateways: Preferred over NAT instances. Scale from 5 Gbps to 45 Gbps. No need to patch or apply security groups.
A NAT gateway automatically gets an IP address; route tables must be updated to point to the gateway.
Load Balancers Types:
- Application Load balancer – Typically used.
- Network Load balancer – Extreme performance
- Classic load balancer – low cost
- The load balancer health-checks instances and marks them as “InService” or “OutOfService”
- 504 error – gateway timeout: the web service or database behind the load balancer timed out
- We can’t get a static IP for an Application/Classic LB, but we can get one for a Network Load Balancer
SQS – Simple Queue Service – decouples your infrastructure. Pull-based. Message size up to 256 KB
Standard – order not guaranteed; a message can be delivered more than once
FIFO – ordered; each message is delivered exactly once
Long polling doesn’t return a response until a message arrives or the poll times out, which reduces empty responses and can save money.
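Long polling is switched on per request by asking SQS to wait for messages, up to a maximum of 20 seconds. The sketch below only builds the request parameters, so it runs without AWS credentials; `long_poll_params` is a hypothetical helper, the queue URL in the usage note is a placeholder, and the keys match the keyword arguments of the boto3 SQS client's `receive_message` call.

```python
MAX_WAIT_SECONDS = 20      # longest wait SQS accepts for one receive call
MAX_BATCH_SIZE = 10        # most messages one receive call can return


def long_poll_params(queue_url, wait_seconds=MAX_WAIT_SECONDS, max_messages=MAX_BATCH_SIZE):
    """Build receive-message parameters that enable long polling."""
    return {
        "QueueUrl": queue_url,
        # A wait of 0 means short polling; anything above 0 is long polling.
        "WaitTimeSeconds": min(max(wait_seconds, 0), MAX_WAIT_SECONDS),
        "MaxNumberOfMessages": min(max(max_messages, 1), MAX_BATCH_SIZE),
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("sqs").receive_message(**long_poll_params(queue_url))`, where `queue_url` is your own queue's URL.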
- SQS is message-oriented; SWF (Simple Workflow Service) is task-oriented.
- SNS: Push-based.
- Elastic Transcoder: media transcoder in the cloud.
- Kinesis: Processing of streaming big data in real-time
- Kinesis Data Streams persists data (24 hours by default); Kinesis Firehose loads streaming data into destinations and has no persistence.