What Does Clearing Data and Cache Mean?
In general terms, a cache (pronounced "cash") is a type of repository. You can think of a repository as a storage depot. In the military, a depot holds the weapons, food, and other supplies needed to carry out a mission.
In computer science, these "supplies" are termed resources, where the resources are scripts, code, and document content. The latter is sometimes more specifically referred to as "assets" such as text, static data, media, and hyperlinks, but here I'll just use the one term resources.
A cache's primary purpose is to speed up retrieval of web page resources, decreasing page load times. Another critical aspect of a cache is to ensure that it contains relatively fresh data.
This article will cover two prevalent methods of caching: browser caching and Content Delivery Networks (CDNs).
Besides caches, other repositories come into play in web architectures; often these are designed to hold vast troves of data. They are not as focussed, though, on retrieval performance.
For example, Amazon Glacier is a data repository that is designed to store data cheaply, but not retrieve it quickly. An SQL database, on the other hand, is designed to be flexible, up-to-date, and fast, but is seldom cheap and not usually as fast as a cache.
A memory cache stores resources locally on the computer where the browser is running. While the browser is active, retrieved resources will be stored in the computer's physical memory (RAM), and possibly also on the hard drive.
Later, when the exact same resources are needed on revisiting a web page, the browser will pull them from the cache instead of the remote server. Since the cache is stored locally, in fast memory, those resources are fetched more quickly, and the page loads faster.
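The lookup described above can be sketched as a small dictionary keyed by URI. This is only an illustration under simplifying assumptions: the class name `UriCache` and the `fetch_from_server` callback are hypothetical, and the sketch ignores freshness entirely, which real browser caches do not.

```python
from typing import Callable, Dict


class UriCache:
    """Toy cache keyed by URI (hypothetical; not how any real browser is built)."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_server      # stand-in for a network request
        self._store: Dict[str, bytes] = {}   # URI -> cached resource body
        self.server_fetches = 0              # counts trips to the "server"

    def get(self, uri: str) -> bytes:
        # Only go to the server when the URI has never been seen before.
        if uri not in self._store:
            self.server_fetches += 1
            self._store[uri] = self._fetch(uri)
        return self._store[uri]


# Assumed stand-in for the remote server: it just echoes the URI as bytes.
cache = UriCache(lambda uri: f"content of {uri}".encode("utf-8"))
cache.get("www.foobar.com/about.html")   # first visit: fetched from the server
cache.get("www.foobar.com/about.html")   # revisit: served from the cache
print(cache.server_fetches)
```

The second call never reaches the callback, which is the whole point of the cache: the cost of the remote fetch is paid once per URI.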
Speed of resource retrieval is of the essence, but so is the necessity that the resources be fresh. A stale resource is one that is out-of-date and may no longer be valid.
Part of the job of the browser is to identify which cached resources are stale, and refetch those that are. Since a web page typically has many resources, there will usually be a mix of stale and fresh versions in the cache.
So how does the browser know which cached resources are stale? The answer is not simple, but there are two main approaches: cache-busting and HTTP header fields.
Cache-busting is a server-side technique that ensures that the browser only fetches fresh resources. It does this indirectly.
While cache-busting may sound dramatic, it really doesn't bust anything, and doesn't even touch what is already cached on a browser. All cache-busting does is change the original resource's URI in a way that makes it appear to the browser that the resource is completely new. Since it looks new, it will not be in a browser's cache. The old version of the cached resource will still be cached, but eventually will wither and die, never to be accessed again.
Say I have a web page located at www.foobar.com/about.html which says everything about foobar.com that you would ever want to know. Once you visit that page, it and the resources associated with it are cached by the browser.
Later, foobar.com is bought out by the Quxbaz corporation, and the about page's content undergoes significant changes. The browser's cache won't have that new content, yet it may still believe the content it has is current and will never try to refetch it.
What do you, the Quxbaz web administrator, do to ensure all new content is pushed out?
Since the browser relies on the URI to find items in the cache, if the URI of a resource changes then it's as if the browser has never seen it before, so it goes to fetch that resource from the server.
Thus, if you change the resource URI from www.foobar.com/about.html to www.foobar.com/about2.html (or to www.quxbaz.com/about.html), the browser will not find any cached resource associated with that URI, and will do a full fetch from the server. The resource might be substantially the same as the original under the old URI, but the browser doesn't know that.
You don't have to change the page name, though. Since the URI also includes a query string by definition, you can add a version parameter to the URI: www.foobar.com/about.html?v=2hef9eb1.
In this case, the version parameter v is set to a newly generated hash value whenever the content changes, or when triggered by some other process, such as a server restart. The browser sees that the query string has changed, and because query strings can affect what will be returned, it will fetch an up-to-date resource from the server.
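As a rough illustration of the version parameter, the sketch below derives v from a hash of the resource content, so the URI changes exactly when the content does. The function name `versioned_uri`, the choice of SHA-256, and the eight-character length are assumptions made for this example, not something the article prescribes.

```python
import hashlib


def versioned_uri(uri: str, content: bytes, length: int = 8) -> str:
    """Append a version parameter derived from a hash of the content."""
    # Assumption: a truncated SHA-256 hex digest is "unique enough" here.
    digest = hashlib.sha256(content).hexdigest()[:length]
    # Use '&' if the URI already carries a query string, otherwise '?'.
    separator = "&" if "?" in uri else "?"
    return f"{uri}{separator}v={digest}"


old = versioned_uri("www.foobar.com/about.html", b"All about foobar.com")
new = versioned_uri("www.foobar.com/about.html", b"All about Quxbaz")
print(old)
print(new)
```

Because the two URIs differ, a cache keyed by URI treats the second one as a resource it has never seen. Unchanged content keeps producing the same URI, so it stays cached.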
Neither of these techniques will work if the old URI is directly accessed from a bookmark. Unless the browser was instructed to revalidate the URI on the last cached request (or the cached resource expired), it won't do a full fetch to refresh its cache. This brings us to the next topic.
Every resource request comes with some meta information known as the header. Likewise, every response also has header information associated with it.
In some cases, the browser sees the response header values, and changes corresponding values in subsequent request headers. Among these header values are those that affect how resource caching is performed on the browser.
A HEAD request is like a truncated GET request. Instead of requesting the complete resource, a HEAD request only requests the header fields that would otherwise be returned on a full request.
The header of a resource is generally going to be much smaller (in number of total bytes) than the resource data associated with it (the "body" of the response). The header information is sufficiently informative to allow the browser to determine the freshness of the resource in its cache.
HEAD requests are often used to verify the validity of a server resource (that is, does the resource still exist, and if so, has it been updated since the browser last accessed it?). The browser will use what's in its cache if the HEAD request indicates the resource is valid, otherwise it will perform a full GET request and refresh its cache with what is returned.
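To make the HEAD request concrete, here is a minimal sketch using Python's standard `http.client` against a throwaway local server, so it runs without touching the network. The handler class, the page body, and the ETag value "v1" are all made up for the demonstration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server that answers HEAD and GET for any path."""

    BODY = b"<html>about page</html>"

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(self.BODY)))
        self.send_header("ETag", '"v1"')  # made-up validator value
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()              # headers only, no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(self.BODY)       # headers plus the full body

    def log_message(self, *args):
        pass                              # keep the demo quiet


def head(host, port, path):
    """Issue a HEAD request and return (status, headers, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, dict(resp.getheaders()), resp.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return head("127.0.0.1", server.server_port, "/about.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, headers, body = demo()
    print(status, headers.get("ETag"), len(body))
```

The response carries the same header fields a GET would, but the body is empty, which is why checking validity this way is cheap compared with refetching the resource.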
With a conditional request, the browser sends fields in the header describing the freshness of its cached resource. This time, the server determines if the browser's cache is still fresh.
If it is, the server returns a 304 response with just the resource's header information, and no resource body (the data). If the browser's cache is determined to be outdated, then the server will return a full 200 OK response.
This mechanism is faster than using HEAD requests, since it eliminates the possibility of having to issue two requests instead of one.
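The server's side of a conditional request can be sketched as a single decision. The example assumes the browser describes its cached copy with the standard `If-None-Match` request header, which the article does not name; the function `respond` and its return shape are hypothetical. It is also simplified: real servers accept a list of validators in that header and handle weak comparison.

```python
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], Optional[bytes]]


def respond(request_headers: Dict[str, str], current_etag: str, body: bytes) -> Response:
    """Return 304 with no body if the browser's copy is current, else 200 with the body."""
    if request_headers.get("If-None-Match") == current_etag:
        # The cached copy is still fresh: headers only, no resource body.
        return 304, {"ETag": current_etag}, None
    # The cached copy is missing or outdated: send the full resource.
    return 200, {"ETag": current_etag}, body


page = b"<html>about Quxbaz</html>"
print(respond({"If-None-Match": '"abc"'}, '"abc"', page)[0])
print(respond({"If-None-Match": '"old"'}, '"abc"', page)[0])
```

Either way the browser gets its answer in one round trip, which is the advantage over a HEAD request followed by a separate GET.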
The above simplifies what can be a pretty complicated process. There's a lot of fine-tuning involved in caching, but it is all controlled through header fields, the most important of which is cache-control.
When responding to a request, the server will send header fields to the browser indicating what behavior it should adopt when caching. If I load the page at https://en.wikipedia.org/wiki/Uniform_Resource_Identifier, the response contains this in its header record:
cache-control: private, s-maxage=0, max-age=0, must-revalidate
private means that only the browser should cache the document content.
s-maxage and max-age are set to 0. The s-maxage value is for proxy servers with caches, whereas max-age is intended for the browser. The effect of setting max-age alone is that the cached resource expires immediately, yet it may still be used (even though stale) during page reloads while in the same browser session.
A stale resource may be revalidated through a HEAD request, which might be followed by a GET request, depending on the response. The must-revalidate directive commands the browser to revalidate the cached resource if it is stale.
Since max-age is set to 0 in this case, the cached resource is immediately stale once received. The combination of the two directives is equivalent to the single directive no-cache.
The two settings ensure that the browser always revalidates the cached resource, whether still in the same session or not.
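To see how these directives might be read programmatically, the sketch below splits the cache-control value shown above into a dictionary and applies the max-age rule. The helper names `parse_cache_control` and `is_fresh` are hypothetical, and the parser is deliberately naive: it does not handle quoted values that contain commas.

```python
from typing import Dict, Union

Directives = Dict[str, Union[str, bool]]


def parse_cache_control(value: str) -> Directives:
    """Split a cache-control header value into {directive: argument or True}."""
    directives: Directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without '=' (e.g. private) are stored as plain flags.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def is_fresh(directives: Directives, age_seconds: int) -> bool:
    """A cached copy is fresh only while its age is below max-age."""
    max_age = directives.get("max-age")
    if not isinstance(max_age, str):
        return False  # no usable max-age: treat as stale in this sketch
    return age_seconds < int(max_age)


parsed = parse_cache_control("private, s-maxage=0, max-age=0, must-revalidate")
print(parsed)
print(is_fresh(parsed, age_seconds=0))
```

With max-age=0 the copy is stale from the moment it arrives, and the must-revalidate flag then tells the browser it has to check with the server before reusing it.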
Cache-control directives are very extensive, and at times confusing – they're a topic in their own right. A complete documented list of directives can be found here.
An ETag (entity tag) is a token that the server sends and the browser retains until the next request. It is only used when the browser knows that the resource's cache lifetime has expired.
ETags are server-generated hash values, which often use the resource's physical file name and last modified date on the server as a seed. When a resource file is updated, the modified date changes, and a new hash value is generated and sent in the header of the response to the next request.
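One way such a value could be produced is sketched below, seeding a hash with the file name and last modified time as described above. The function `make_etag`, the SHA-256 choice, and the 16-character truncation are assumptions for illustration; real servers each have their own scheme.

```python
import hashlib


def make_etag(file_name: str, last_modified: float) -> str:
    """Build a quoted ETag from the file name and modification timestamp."""
    seed = f"{file_name}:{last_modified}".encode("utf-8")
    # ETag values are sent wrapped in double quotes.
    return '"' + hashlib.sha256(seed).hexdigest()[:16] + '"'


before = make_etag("about.html", 1583085542.0)  # hypothetical modification time
after = make_etag("about.html", 1583171942.0)   # same file, updated a day later
print(before)
print(after)
```

The same file with the same timestamp always yields the same tag, so an unchanged resource keeps validating, while any update to the file changes the tag.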
The header tags expires and last-modified are all but obsolete, yet are still sent by most servers for backward compatibility with older browsers. An example:
expires: Thu, 01 Jan 1970 00:00:00 GMT
last-modified: Sun, 01 Mar 2020 17:59:02 GMT
Here, expires is set to the zeroth date, the start of the UNIX epoch (a convention inherited from the UNIX operating system). That indicates that the resource expires immediately, just as max-age=0 does. last-modified tells the browser when the latest update was made to the resource, which it can then use to decide if it should refetch it rather than use the cached value.
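The two header values above can be checked with Python's standard `email.utils.parsedate_to_datetime`, which parses this date format. The helper names `is_expired` and `modified_since` and the example timestamps are assumptions made for this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_header: str, now: datetime) -> bool:
    """True if the expires date is at or before the current time."""
    return parsedate_to_datetime(expires_header) <= now


def modified_since(last_modified_header: str, cached_at: datetime) -> bool:
    """True if the resource changed after the browser cached its copy."""
    return parsedate_to_datetime(last_modified_header) > cached_at


now = datetime(2020, 3, 2, tzinfo=timezone.utc)        # hypothetical "current" time
cached_at = datetime(2020, 2, 1, tzinfo=timezone.utc)  # hypothetical time of caching

print(is_expired("Thu, 01 Jan 1970 00:00:00 GMT", now))
print(modified_since("Sun, 01 Mar 2020 17:59:02 GMT", cached_at))
```

An expires value at the epoch is in the past for any realistic clock, so the copy is always treated as expired. The last-modified comparison then decides whether the cached copy is actually out of date.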
A hard reload forces the refetch of all resources on a page, whether they're content, scripts, stylesheets or media. Pretty much everything, right?
Well, some resources may not be explicitly included on a page. Instead, they can be fetched dynamically, usually after everything explicit has loaded.
The browser doesn't know ahead of time that this will happen, and when it does, the later requests (initiated by scripts, usually) will still use cached copies of those resources if available.
Clearing the cache empties the entire browser cache, which has the same effect as a hard reload, but additionally causes dynamically loaded resources to be refetched as well. After all, there's nothing in the cache, so there is no choice!
A CDN is more than just a cache, but caching is one of its jobs. A CDN stores data in geographically distributed locations so that round-trip times to and from a geographically local browser are reduced.
Browser requests are routed to a nearby CDN, thereby shortening the physical distance response data has to travel. CDNs also are able to handle large amounts of traffic, and provide security against some types of attacks.
A CDN gets its resources through Internet Exchange Points (IXPs), nodes that are part of the backbone of The Internet (in caps). The first step is to set up request routing so that requests go to a CDN instead of the host server. The next step is to make sure the CDN has the current content of your website.
In the old days, most CDNs supported the push method: a website would push new content to a CDN hub, which would then get distributed to geographically dispersed nodes.
Nowadays, most CDNs use the caching protocols described above (or similar) to 1) download new resources, and 2) refresh existing ones. The browser still has its cache, and none of that changes. All a CDN does is make those transfers of new resources faster.
Translated from: https://www.freecodecamp.org/news/what-is-cached-data/