Storage classes
Amazon S3 offers a range of storage classes designed for different use cases. For example:
S3 Standard or S3 Express One Zone for frequently accessed data.
S3 Standard-IA or S3 One Zone-IA for infrequently accessed data.
S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive for archival data.
S3 Express One Zone
Amazon S3 Express One Zone is a high-performance, single-zone Amazon S3 storage class that is purpose-built to deliver consistent, single-digit millisecond data access for your most latency-sensitive applications. S3 Express One Zone is the lowest latency cloud object storage class available today, with data access speeds up to 10x faster and with request costs 50 percent lower than S3 Standard. S3 Express One Zone is the first S3 storage class where you can select a single Availability Zone with the option to co-locate your object storage with your compute resources, which provides the highest possible access speed.
Additionally, to further increase access speed and support hundreds of thousands of requests per second, data is stored in a new bucket type: an Amazon S3 directory bucket. For more information, see What is S3 Express One Zone? and Directory buckets.
S3 Intelligent-Tiering
You can store data with changing or unknown access patterns in S3 Intelligent-Tiering, which optimizes storage costs by automatically moving your data between four access tiers when your access patterns change. These four access tiers include two low-latency access tiers optimized for frequent and infrequent access, and two opt-in archive access tiers designed for asynchronous access for rarely accessed data.
Storage management
Amazon S3 has storage management features that you can use to manage costs, meet regulatory requirements, reduce latency, and save multiple distinct copies of your data for compliance requirements.
S3 Lifecycle – Configure a lifecycle configuration to manage your objects and store them cost effectively throughout their lifecycle. You can transition objects to other S3 storage classes or expire objects that reach the end of their lifetimes.
S3 Object Lock – Prevent Amazon S3 objects from being deleted or overwritten for a fixed amount of time or indefinitely. You can use Object Lock to help meet regulatory requirements that require write-once-read-many (WORM) storage or to simply add another layer of protection against object changes and deletions.
S3 Replication – Replicate objects and their respective metadata and object tags to one or more destination buckets in the same or different AWS Regions for reduced latency, compliance, security, and other use cases.
S3 Batch Operations – Manage billions of objects at scale with a single S3 API request or a few clicks in the Amazon S3 console. You can use Batch Operations to perform operations such as Copy, Invoke AWS Lambda function, and Restore on millions or billions of objects.
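As a rough illustration of the S3 Lifecycle item above, a lifecycle configuration can be pictured as a list of rules with transitions and an expiration. The sketch below writes one such rule as a plain Python dict; the rule ID, prefix, and day counts are illustrative assumptions, and `state_at` is a hypothetical local helper for reasoning about the rule (S3 itself evaluates rules on the server side).

```python
# One lifecycle rule as a plain dict. The rule ID, prefix, and day counts
# are made-up values for illustration only.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",   # hypothetical rule name
            "Filter": {"Prefix": "logs/"},      # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}


def state_at(rule, age_days, initial="STANDARD"):
    """Return the storage class an object of this age would be in,
    or None once the rule has expired it. Local reasoning aid only."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = initial
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current
```

Walking an object's age through `state_at` makes it easy to sanity-check a rule before applying it to a bucket.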
Access management and security
Amazon S3 provides features for auditing and managing access to your buckets and objects. By default, S3 buckets and the objects in them are private. You have access only to the S3 resources that you create. To grant granular resource permissions that support your specific use case or to audit the permissions of your Amazon S3 resources, you can use the following features.
S3 Block Public Access
Block public access to S3 buckets and objects. By default, Block Public Access settings are turned on at the bucket level. We recommend that you keep all Block Public Access settings enabled unless you know that you need to turn off one or more of them for your specific use case. For more information, see Configuring block public access settings for your S3 buckets.
AWS Identity and Access Management (IAM)
IAM is a web service that helps you securely control access to AWS resources, including your Amazon S3 resources. With IAM, you can centrally manage permissions that control which AWS resources users can access. You use IAM to control who is authenticated (signed in) and authorized (has permissions) to use resources.
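AccessKey and SecretKey
To use the S3 service, you need an AccessKey (a 20-character ASCII string) and a SecretKey (a 40-character ASCII string) issued by S3. The AccessKey identifies the client. The SecretKey acts as a private key: it is stored on the client's server and is never transmitted over the network. The SecretKey is typically used as the key for computing the request signature, which proves that the request comes from the specified client. Identification by AccessKey, combined with a digital signature made with the SecretKey, completes application access, authentication, and authorization.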
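The division of labor between the two keys can be sketched with a plain HMAC. This is a deliberately simplified model, not the actual S3 signing algorithm: the string being signed, the `sign` and `verify` helpers, and the server-side key table are all assumptions made for illustration, and the credentials are documentation-style placeholders.

```python
import hashlib
import hmac

# Placeholder credentials in the documented shapes, not real keys.
ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"                       # 20 characters
SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"   # 40 characters


def sign(secret_key: str, message: str) -> str:
    """Client side: compute a signature with the secret key."""
    return hmac.new(secret_key.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify(secret_lookup: dict, access_key: str, message: str, signature: str) -> bool:
    """Server side: find the secret by access key and recompute the signature."""
    secret = secret_lookup.get(access_key)
    if secret is None:
        return False
    return hmac.compare_digest(sign(secret, message), signature)


# Hypothetical string to sign; only the access key and signature travel
# with the request, never the secret key itself.
request = "GET\n/photos/puppy.jpg\n20240101T000000Z"
signature = sign(SECRET_KEY, request)
server_side_keys = {ACCESS_KEY: SECRET_KEY}
```

Because the server recomputes the signature from its own copy of the secret, any change to the request after signing causes verification to fail.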
Bucket policies
Use IAM-based policy language to configure resource-based permissions for your S3 buckets and the objects in them.
Amazon S3 access points
Configure named network endpoints with dedicated access policies to manage data access at scale for shared datasets in Amazon S3.
Access control lists (ACLs)
Grant read and write permissions for individual buckets and objects to authorized users. As a general rule, we recommend using S3 resource-based policies (bucket policies and access point policies) or IAM user policies for access control instead of ACLs. Policies are a simplified and more flexible access control option. With bucket policies and access point policies, you can define rules that apply broadly across all requests to your Amazon S3 resources. For more information about the specific cases when you’d use ACLs instead of resource-based policies or IAM user policies, see Access policy guidelines.
S3 Object Ownership
Take ownership of every object in your bucket, simplifying access management for data stored in Amazon S3. S3 Object Ownership is an Amazon S3 bucket-level setting that you can use to disable or enable ACLs. By default, ACLs are disabled. With ACLs disabled, the bucket owner owns all the objects in the bucket and manages access to data exclusively by using access-management policies.
IAM Access Analyzer for S3
Evaluate and monitor your S3 bucket access policies, ensuring that the policies provide only the intended access to your S3 resources.
Data processing
To transform data and trigger workflows to automate a variety of other processing activities at scale, you can use the following features.
S3 Object Lambda
S3 Object Lambda – Add your own code to S3 GET, HEAD, and LIST requests to modify and process data as it is returned to an application. Filter rows, dynamically resize images, redact confidential data, and much more.
Event notifications
Event notifications – Trigger workflows that use Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), and AWS Lambda when a change is made to your S3 resources.
Storage logging and monitoring
Amazon S3 provides logging and monitoring tools that you can use to monitor and control how your Amazon S3 resources are being used. For more information, see Monitoring tools.
Automated monitoring tools
Amazon CloudWatch metrics for Amazon S3 – Track the operational health of your S3 resources and configure billing alerts when estimated charges reach a user-defined threshold.
AWS CloudTrail – Record actions taken by a user, a role, or an AWS service in Amazon S3. CloudTrail logs provide you with detailed API tracking for S3 bucket-level and object-level operations.
Manual monitoring tools
Server access logging – Get detailed records for the requests that are made to a bucket. You can use server access logs for many use cases, such as conducting security and access audits, learning about your customer base, and understanding your Amazon S3 bill.
AWS Trusted Advisor – Evaluate your account by using AWS best practice checks to identify ways to optimize your AWS infrastructure, improve security and performance, reduce costs, and monitor service quotas. You can then follow the recommendations to optimize your services and resources.
Analytics and insights
Amazon S3 offers features to help you gain visibility into your storage usage, which empowers you to better understand, analyze, and optimize your storage at scale.
Amazon S3 Storage Lens – Understand, analyze, and optimize your storage. S3 Storage Lens provides 60+ usage and activity metrics and interactive dashboards to aggregate data for your entire organization, specific accounts, AWS Regions, buckets, or prefixes.
Storage Class Analysis – Analyze storage access patterns to decide when it’s time to move data to a more cost-effective storage class.
S3 Inventory with Inventory reports – Audit and report on objects and their corresponding metadata and configure other Amazon S3 features to take action in Inventory reports. For example, you can report on the replication and encryption status of your objects. For a list of all the metadata available for each object in Inventory reports, see Amazon S3 Inventory list.
Strong consistency
Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests of objects in your Amazon S3 bucket in all AWS Regions. This behavior applies to both writes of new objects as well as PUT requests that overwrite existing objects and DELETE requests. In addition, read operations on Amazon S3 Select, Amazon S3 access control lists (ACLs), Amazon S3 Object Tags, and object metadata (for example, the HEAD object) are strongly consistent. For more information, see Amazon S3 data consistency model.
How Amazon S3 works
Amazon S3 is an object storage service that stores data as objects within buckets. An object is a file and any metadata that describes the file. A bucket is a container for objects.
To store your data in Amazon S3, you first create a bucket and specify a bucket name and AWS Region. Then, you upload your data to that bucket as objects in Amazon S3. Each object has a key (or key name), which is the unique identifier for the object within the bucket.
S3 provides features that you can configure to support your specific use case. For example, you can use S3 Versioning to keep multiple versions of an object in the same bucket, which allows you to restore objects that are accidentally deleted or overwritten.
Buckets and the objects in them are private and can be accessed only if you explicitly grant access permissions. You can use bucket policies, AWS Identity and Access Management (IAM) policies, access control lists (ACLs), and S3 Access Points to manage access.
Buckets
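A bucket is a container for objects, and every object must be stored in a specific bucket. In RGW, each user can create up to 1000 buckets by default, and each bucket can hold an unlimited number of objects. Buckets cannot be nested: a bucket can contain only objects, not other buckets, and the objects in a bucket form a flat structure. Bucket names are globally unique and follow the same rules as DNS names.
Bucket naming rules:
1) Use only lowercase letters (a-z), digits (0-9), and hyphens (-), that is: abcdefghijklmnopqrstuvwxyz0123456789-.
2) Start with a letter or a digit.
3) Be between 3 and 255 characters long.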
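The three naming rules above translate directly into a regular expression. The sketch below encodes only the rules as listed in these notes; a particular service may impose further constraints (for example a shorter maximum length), and `is_valid_bucket_name` is a hypothetical helper name.

```python
import re

# First character: lowercase letter or digit.
# Remaining 2 to 254 characters: lowercase letters, digits, or hyphens.
# Total length therefore falls between 3 and 255 characters.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{2,254}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the three rules listed in the notes."""
    return _BUCKET_NAME.fullmatch(name) is not None
```

Validating names locally avoids a round trip to the service just to learn that a name is malformed.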
A bucket is a container for objects stored in Amazon S3. You can store any number of objects in a bucket and can have up to 100 buckets in your account. To request an increase, visit the Service Quotas console.
Every object is contained in a bucket. For example, if the object named photos/puppy.jpg is stored in the DOC-EXAMPLE-BUCKET bucket in the US West (Oregon) Region, then it is addressable by using the URL https://DOC-EXAMPLE-BUCKET.s3.us-west-2.amazonaws.com/photos/puppy.jpg. For more information, see Accessing a Bucket.
When you create a bucket, you enter a bucket name and choose the AWS Region where the bucket will reside. After you create a bucket, you cannot change the name of the bucket or its Region. Bucket names must follow the bucket naming rules. You can also configure a bucket to use S3 Versioning or other storage management features.
Buckets also:
Organize the Amazon S3 namespace at the highest level.
Identify the account responsible for storage and data transfer charges.
Provide access control options, such as bucket policies, access control lists (ACLs), and S3 Access Points, that you can use to manage access to your Amazon S3 resources.
Serve as the unit of aggregation for usage reporting.
For more information about buckets, see Buckets overview.
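Object (file)
In S3, the object is the basic unit of data that users work with. A single object can store from 0 bytes to 5 TB of data.
An object consists of a key and data: the key is the object's name, and the data is the object's content. The key is UTF-8 encoded, and its encoded length must not exceed 1024 bytes.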
Objects are the fundamental entities stored in Amazon S3. Objects consist of object data and metadata. The metadata is a set of name-value pairs that describe the object. These pairs include some default metadata, such as the date last modified, and standard HTTP metadata, such as Content-Type. You can also specify custom metadata at the time that the object is stored.
An object is uniquely identified within a bucket by a key (name) and a version ID (if S3 Versioning is enabled on the bucket). For more information about objects, see Amazon S3 objects overview.
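Key (file name)
The key is the object's name. It is UTF-8 encoded, and its encoded length must not exceed 1024 bytes. A key can contain slashes; when it does, the console automatically presents the objects as a directory structure.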
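Because the namespace inside a bucket is flat, the "directories" shown in the console are derived from the keys themselves. The following sketch shows one way such a view could be computed from a list of keys, along with a check of the 1024-byte limit; `list_level` and `key_fits` are hypothetical helper names, and the logic is an approximation of prefix and delimiter grouping rather than the service's implementation.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split the keys under `prefix` into direct objects and 'folder' prefixes."""
    objects, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts as a folder name.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(folders)


def key_fits(key: str, limit: int = 1024) -> bool:
    """The limit applies to the UTF-8 encoded length, not the character count."""
    return len(key.encode("utf-8")) <= limit
```

Note that a key made of non-ASCII characters reaches the byte limit sooner than its character count suggests, since each character may encode to several bytes.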
An object key (or key name) is the unique identifier for an object within a bucket. Every object in a bucket has exactly one key. The combination of a bucket, object key, and optionally, version ID (if S3 Versioning is enabled for the bucket) uniquely identify each object. So you can think of Amazon S3 as a basic data map between “bucket + key + version” and the object itself.
Every object in Amazon S3 can be uniquely addressed through the combination of the web service endpoint, bucket name, key, and optionally, a version. For example, in the URL https://DOC-EXAMPLE-BUCKET.s3.us-west-2.amazonaws.com/photos/puppy.jpg, DOC-EXAMPLE-BUCKET is the name of the bucket and photos/puppy.jpg is the key.
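The addressing scheme in the example URL can be sketched as a pair of helpers that build and take apart such a URL. This assumes the `bucket.s3.region.amazonaws.com` host layout shown in the example; other endpoint layouts exist and are not handled here, and both function names are hypothetical.

```python
from urllib.parse import quote, unquote, urlsplit


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a URL in the layout of the example above."""
    # quote() percent-encodes unsafe characters but leaves "/" as is.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def parse_object_url(url: str):
    """Recover (bucket, region, key) from a URL in that same layout."""
    parts = urlsplit(url)
    # netloc keeps the original letter case of the host.
    bucket, _, rest = parts.netloc.partition(".s3.")
    region = rest.split(".", 1)[0]
    key = unquote(parts.path[1:])   # drop only the leading "/"
    return bucket, region, key
```

For the example in the text, `parse_object_url` yields the bucket name, the Region code `us-west-2`, and the key `photos/puppy.jpg`.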
For more information about object keys, see Creating object key names.
S3 Versioning
You can use S3 Versioning to keep multiple variants of an object in the same bucket. With S3 Versioning, you can preserve, retrieve, and restore every version of every object stored in your buckets. You can easily recover from both unintended user actions and application failures.
For more information, see Using versioning in S3 buckets.
Version ID
When you enable S3 Versioning in a bucket, Amazon S3 generates a unique version ID for each object added to the bucket. Objects that already existed in the bucket at the time that you enable versioning have a version ID of null. If you modify these (or any other) objects with other operations, such as CopyObject and PutObject, the new objects get a unique version ID.
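The version ID behavior described above can be modeled with a small in-memory class. This is a toy model for intuition only, not how S3 is implemented: `VersionedBucket` is a hypothetical name, and random hex strings stand in for the opaque IDs that S3 generates.

```python
import uuid


class VersionedBucket:
    """Toy in-memory model of version IDs."""

    def __init__(self):
        self.versioning = False
        self.versions = {}   # key -> list of (version_id, data), newest last

    def enable_versioning(self):
        self.versioning = True

    def put(self, key, data):
        if self.versioning:
            version_id = uuid.uuid4().hex   # stand-in for an S3 version ID
            self.versions.setdefault(key, []).append((version_id, data))
        else:
            version_id = "null"
            self.versions[key] = [(version_id, data)]   # overwrite in place
        return version_id

    def get(self, key, version_id=None):
        history = self.versions[key]
        if version_id is None:
            return history[-1][1]           # latest version
        for vid, data in history:
            if vid == version_id:
                return data
        raise KeyError(version_id)
```

An object written before versioning was enabled remains reachable under the version ID `null`, while later writes stack new versions on top of it.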
For more information, see Using versioning in S3 buckets.
Bucket policy
A bucket policy is a resource-based AWS Identity and Access Management (IAM) policy that you can use to grant access permissions to your bucket and the objects in it. Only the bucket owner can associate a policy with a bucket. The permissions attached to the bucket apply to all of the objects in the bucket that are owned by the bucket owner. Bucket policies are limited to 20 KB in size.
Bucket policies use JSON-based access policy language that is standard across AWS. You can use bucket policies to add or deny permissions for the objects in a bucket. Bucket policies allow or deny requests based on the elements in the policy, including the requester, S3 actions, resources, and aspects or conditions of the request (for example, the IP address used to make the request). For example, you can create a bucket policy that grants cross-account permissions to upload objects to an S3 bucket while ensuring that the bucket owner has full control of the uploaded objects. For more information, see Bucket policy examples.
In your bucket policy, you can use wildcard characters on Amazon Resource Names (ARNs) and other values to grant permissions to a subset of objects. For example, you can control access to groups of objects that begin with a common prefix or end with a given extension, such as .html.
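To make the pieces of a bucket policy concrete, here is a sketch of a policy document that allows reading only `.html` objects from a given IP range. The statement ID and the IP range are illustrative assumptions. The `resource_matches` helper uses `fnmatchcase` only as an approximation of how a `*` wildcard in an ARN matches object keys; it is not the policy engine's actual matcher.

```python
import json
from fnmatch import fnmatchcase

BUCKET = "DOC-EXAMPLE-BUCKET"
RESOURCE = f"arn:aws:s3:::{BUCKET}/*.html"   # wildcard: objects ending in .html

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowHtmlReadFromExampleRange",   # hypothetical statement ID
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": RESOURCE,
            # Illustrative condition on the requester's IP address.
            "Condition": {"IpAddress": {"aws:SourceIp": "192.0.2.0/24"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
# Bucket policies are limited to 20 KB in size.
assert len(policy_json.encode("utf-8")) <= 20 * 1024


def resource_matches(key: str) -> bool:
    """Approximate check of whether a key falls under the wildcard resource."""
    return fnmatchcase(f"arn:aws:s3:::{BUCKET}/{key}", RESOURCE)
```

Here the wildcard covers `index.html` and `docs/guide.html` but not `photos/puppy.jpg`, which is the "subset of objects" effect described above.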
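S3 also provides bucket-level access control lists (ACLs). A bucket currently supports three access permissions: public-read-write, public-read, and private.
public-read-write: Anyone, including anonymous users, can perform PUT, GET, and DELETE operations on the objects in the bucket.
public-read: Anyone, including anonymous users, can read the objects in the bucket but cannot write to them. Note that read permission on a bucket does not imply read permission on its objects.
private: Only the creator of the bucket can read and write the bucket and the objects in it.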
S3 Access Points
Amazon S3 Access Points – Amazon Web Services
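Customers increasingly use Amazon S3 to store shared datasets, where data is aggregated and accessed by different applications, teams, and individuals, whether for analytics, machine learning, real-time monitoring, or other data lake use cases. Managing access to such a shared bucket requires a single bucket policy that controls access for dozens to hundreds of applications with different permission levels. As the set of applications grows, the bucket policy becomes more complex and time-consuming to manage, and it must be audited to make sure that changes do not have an unexpected impact on other applications.
Amazon S3 Access Points is an S3 feature that simplifies data access for any AWS service or customer application that stores data in S3. With S3 Access Points, customers can create a unique access control policy for each access point to easily control access to shared datasets. Customers with shared datasets, including data lakes, media archives, and user-generated content, can easily scale access for hundreds of applications by creating individualized access points with names and permissions customized for each application.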
Access control lists (ACLs)
ACL (access control permissions)
You can use ACLs to grant read and write permissions to authorized users for individual buckets and objects. Each bucket and object has an ACL attached to it as a subresource. The ACL defines which AWS accounts or groups are granted access and the type of access. ACLs are an access control mechanism that predates IAM. For more information about ACLs, see Access control list (ACL) overview.
S3 Object Ownership is an Amazon S3 bucket-level setting that you can use to both control ownership of the objects that are uploaded to your bucket and to disable or enable ACLs. By default, Object Ownership is set to the Bucket owner enforced setting, and all ACLs are disabled. When ACLs are disabled, the bucket owner owns all the objects in the bucket and manages access to them exclusively by using access-management policies.
A majority of modern use cases in Amazon S3 no longer require the use of ACLs. We recommend that you keep ACLs disabled, except in unusual circumstances where you need to control access for each object individually. With ACLs disabled, you can use policies to control access to all objects in your bucket, regardless of who uploaded the objects to your bucket. For more information, see Controlling ownership of objects and disabling ACLs for your bucket.
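ACLs are access control policies for buckets and objects, for example allowing public access by anonymous users. ACLs currently support three permissions: READ, WRITE, and FULL_CONTROL. The owner of a bucket always has FULL_CONTROL. READ, WRITE, or FULL_CONTROL can be granted to all users (including anonymous users) or to specific users.
Three canned ACLs are currently provided: private, public-read, and public-read-write.
1) private: only the owner has READ and WRITE permission.
2) public-read: all users are granted READ permission.
3) public-read-write: all users are granted READ and WRITE permission.
For a bucket, READ means being able to list the objects in the bucket and the parts that have already been uploaded; WRITE means being able to upload or delete objects in the bucket; FULL_CONTROL includes both the READ and WRITE operations on the bucket. For an object, READ means being able to view or download the object; WRITE means being able to write, overwrite, or delete the object; FULL_CONTROL includes both the READ and WRITE operations on the object.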
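The canned ACLs above amount to a small lookup table from ACL name to the permissions granted to everyone other than the owner. The sketch below models that table; `CANNED_ACLS` and `is_allowed` are hypothetical names, and the model ignores grants to specific users.

```python
# Permissions each canned ACL grants to all users other than the owner.
CANNED_ACLS = {
    "private": set(),
    "public-read": {"READ"},
    "public-read-write": {"READ", "WRITE"},
}


def is_allowed(canned_acl: str, permission: str, is_owner: bool) -> bool:
    """Decide whether a READ or WRITE request passes under a canned ACL."""
    if is_owner:
        return True   # the owner always has FULL_CONTROL
    return permission in CANNED_ACLS[canned_acl]
```

Under this model, an anonymous WRITE succeeds only with public-read-write, and an anonymous READ succeeds with either public ACL.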
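Object access control permissions
S3 also provides object-level access control lists (ACLs). An object currently supports two access permissions: public-read and private.
public-read: Anyone, including anonymous users, can read (download) the object.
private: Only the owner of the object can operate on it.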
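Region
When you create a bucket, you must choose a Region. This parameter generally identifies the physical location where the resources are stored, such as China or Europe.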
You can choose the geographical AWS Region where Amazon S3 stores the buckets that you create. You might choose a Region to optimize latency, minimize costs, or address regulatory requirements. Objects stored in an AWS Region never leave the Region unless you explicitly transfer or replicate them to another Region. For example, objects stored in the Europe (Ireland) Region never leave it.
You can access Amazon S3 and its features only in the AWS Regions that are enabled for your account. For more information about enabling a Region to create and manage AWS resources, see Managing AWS Regions in the AWS General Reference.
For a list of Amazon S3 Regions and endpoints, see Regions and endpoints in the AWS General Reference.
Endpoint
Website endpoints – Amazon Simple Storage Service
Amazon Simple Storage Service endpoints and quotas – AWS General Reference
Virtual hosting of buckets – Amazon Simple Storage Service
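An endpoint is the public domain name of a Region. For example, the domain s3.cn.example.com would be the public access address of the S3 service in the China Region.
S3 provides an endpoint for each Region to handle that Region's requests. Users access the object storage service through that endpoint.
REST API endpoint: When you use object storage to hold data, you use the REST API endpoint to upload, download, and otherwise manage that data.
Website endpoint: AWS S3 can host static websites. The address used to reach such a static site is the website endpoint.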
Accessing Amazon S3
You can work with Amazon S3 in any of the following ways:
AWS Management Console
The console is a web-based user interface for managing Amazon S3 and AWS resources. If you’ve signed up for an AWS account, you can access the Amazon S3 console by signing into the AWS Management Console and choosing S3 from the AWS Management Console home page.
AWS Command Line Interface
You can use the AWS command line tools to issue commands or build scripts at your system’s command line to perform AWS (including S3) tasks.
The AWS Command Line Interface (AWS CLI) provides commands for a broad set of AWS services. The AWS CLI is supported on Windows, macOS, and Linux. To get started, see the AWS Command Line Interface User Guide. For more information about the commands for Amazon S3, see s3api and s3control in the AWS CLI Command Reference.
AWS SDKs
AWS provides SDKs (software development kits) that consist of libraries and sample code for various programming languages and platforms (Java, Python, Ruby, .NET, iOS, Android, and so on). The AWS SDKs provide a convenient way to create programmatic access to S3 and AWS. Amazon S3 is a REST service. You can send requests to Amazon S3 using the AWS SDK libraries, which wrap the underlying Amazon S3 REST API and simplify your programming tasks. For example, the SDKs take care of tasks such as calculating signatures, cryptographically signing requests, managing errors, and retrying requests automatically. For information about the AWS SDKs, including how to download and install them, see Tools for AWS.
Every interaction with Amazon S3 is either authenticated or anonymous. If you are using the AWS SDKs, the libraries compute the signature for authentication from the keys that you provide. For more information about how to make requests to Amazon S3, see Making requests.
Amazon S3 REST API
The architecture of Amazon S3 is designed to be programming language-neutral, using AWS-supported interfaces to store and retrieve objects. You can access S3 and AWS programmatically by using the Amazon S3 REST API. The REST API is an HTTP interface to Amazon S3. With the REST API, you use standard HTTP requests to create, fetch, and delete buckets and objects.
To use the REST API, you can use any toolkit that supports HTTP. You can even use a browser to fetch objects, as long as they are anonymously readable.
The REST API uses standard HTTP headers and status codes, so that standard browsers and toolkits work as expected. In some areas, we have added functionality to HTTP (for example, we added headers to support access control). In these cases, we have done our best to add the new functionality in a way that matches the style of standard HTTP usage.
If you make direct REST API calls in your application, you must write the code to compute the signature and add it to the request. For more information about how to make requests to Amazon S3, see Making requests.
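The text above does not name the signing scheme. Assuming AWS Signature Version 4, the core of "computing the signature" is a chain of HMAC-SHA256 operations that derives a signing key from the secret key, date, Region, and service. The sketch below shows only that derivation and the final signing step; building the canonical request and the string to sign is omitted, and the function names are hypothetical.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str = "s3") -> bytes:
    """Derive a Signature Version 4 signing key; `date` is in YYYYMMDD form."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign_string(signing_key: bytes, string_to_sign: str) -> str:
    """Sign an already-built string to sign; returns a hex digest."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the date and Region are folded into the key, a signing key derived for one day and Region cannot be reused for another. The SDKs perform these steps for you.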
Paying for Amazon S3
Pricing for Amazon S3 is designed so that you don’t have to plan for the storage requirements of your application. Most storage providers require you to purchase a predetermined amount of storage and network transfer capacity. In this scenario, if you exceed that capacity, your service is shut off or you are charged high overage fees. If you do not exceed that capacity, you pay as though you used it all.
Amazon S3 charges you only for what you actually use, with no hidden fees and no overage charges. This model gives you a variable-cost service that can grow with your business while giving you the cost advantages of the AWS infrastructure. For more information, see Amazon S3 Pricing.
When you sign up for AWS, your AWS account is automatically signed up for all services in AWS, including Amazon S3. However, you are charged only for the services that you use. If you are a new Amazon S3 customer, you can get started with Amazon S3 for free. For more information, see AWS free tier.
To see your bill, go to the Billing and Cost Management Dashboard in the AWS Billing and Cost Management console. To learn more about AWS account billing, see the AWS Billing User Guide. If you have questions concerning AWS billing and AWS accounts, contact AWS Support.
Service
The virtual storage space that S3 provides to a user. Within this space, each user can own one or more buckets.