Mapping in Elasticsearch

Mapping

The data structure and related configuration defined for the documents in an index is called the mapping.

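Exact matching vs. full-text search

ES stores and searches different data types in different ways.

exact value: exact matching (e.g. date). At index time the whole value is indexed as a single term, with no analysis.
full text: full-text search (e.g. text). The value is split into terms (analysis) before matching, so it can also match through abbreviations, tense, letter case, synonyms, and so on.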
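
To make the difference concrete, here is a minimal Python sketch of a toy inverted index. It is not how Elasticsearch or Lucene is implemented; the function names (index_exact, index_full_text, search) and the lowercase-plus-whitespace tokenizer are assumptions made purely for illustration.

```python
from collections import defaultdict


def index_exact(docs):
    """Exact value: the whole value becomes one term in the index."""
    inverted = defaultdict(set)
    for doc_id, value in docs.items():
        inverted[value].add(doc_id)
    return inverted


def index_full_text(docs):
    """Full text: the value is split into lowercase terms first."""
    inverted = defaultdict(set)
    for doc_id, value in docs.items():
        for term in value.lower().split():
            inverted[term].add(doc_id)
    return inverted


def search(inverted, term):
    """Return the sorted ids of documents whose index contains the term."""
    return sorted(inverted.get(term, set()))


titles = {1: "my first article", 2: "my second article"}
dates = {1: "2019-01-01", 2: "2019-01-02"}

full_text = index_full_text(titles)
exact = index_exact(dates)

print(search(full_text, "article"))  # both documents contain this term
print(search(exact, "2019-01-01"))   # only the whole value matches
print(search(exact, "2019"))         # a fragment of the value matches nothing
```

In this toy model a single word finds a full-text value, while an exact value is found only by the complete value.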

Automatic mapping creation

If we index data without specifying a mapping, ES creates the mapping for us automatically.

Insert data

PUT /website/_doc/1
{
  "post_date": "2019-01-01",
  "title": "my first article",
  "content": "this is my first article in this website",
  "author_id": 11400
}

PUT /website/_doc/2
{
  "post_date": "2019-01-02",
  "title": "my second article",
  "content": "this is my second article in this website",
  "author_id": 11400
}
 
PUT /website/_doc/3
{
  "post_date": "2019-01-03",
  "title": "my third article",
  "content": "this is my third article in this website",
  "author_id": 11400
}

View the mapping

GET /website/_mapping
{
  "website" : {
    "mappings" : {
      "properties" : {
        "author_id" : {
          "type" : "long"
        },
        "content" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "post_date" : {
          "type" : "date"
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

Dynamic mapping: ES automatically creates the index and its mapping for us. The mapping records each field's data type, how the field is analyzed, and other settings.

Try some searches

GET /website/_search?q=2019                     # 0 hits
GET /website/_search?q=2019-01-01               # 1 hit
GET /website/_search?q=post_date:2019-01-01     # 1 hit
GET /website/_search?q=post_date:2019           # 0 hits

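Why are the results inconsistent? When ES created the mapping automatically, it assigned different data types to different fields, and different data types are analyzed and searched differently. That is why a search across all fields (the _all field in older ES versions) behaves so differently from a search on the post_date field.

Creating a mapping manually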

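# Assumes the book index already exists (for example, created with PUT /book)
PUT /book/_mapping
{
  "properties": {
    "name": {
      "type": "text"
    },
    "description": {
      "type": "text",
      "analyzer": "english",         # analyzer used at index time
      "search_analyzer": "english"   # analyzer used at search time
    },
    "pic": {
      "type": "text",
      "index": false                 # do not index this field
    },
    "studymodel": {
      "type": "text"
    }
  }
}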

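text type

analyzer: specifies the analyzer. The mapping above sets analyzer to english, which applies at both index time and search time. To use a different analyzer at search time only, set search_analyzer.
index: specifies whether the field is indexed. Only indexed fields can be searched. The default is index=true. Some fields do not need to be searchable, for example a product image path that is only displayed and never searched; for those, set index to false.

Insert data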

PUT /book/_doc/1
{
  "name":"Bootstrap开发框架",
  "description":"Bootstrap是由Twitter推出的一个前台页面开发框架，在行业之中使用较为广泛。此开发框架包含了大量的CSS、JS程序代码，可以帮助开发者（尤其是不擅长页面开发的程序人员）轻松的实现一个不受浏览器限制的精美界面效果。",
  "pic":"group1/M00/00/01/wKhlQFqO4MmAOP53AAAcwDwm6SU490.jpg",
  "studymodel":"201002"
}

Search test:

GET /book/_search?q=name:"开发"
GET /book/_search?q=description:"开发"
GET /book/_search?q=pic:"group1/M00/00/01/wKhlQFqO4MmAOP53AAAcwDwm6SU490.jpg"  # finds no data

pic has index set to false, so it is not indexed and cannot be found by a query.

keyword field

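The keyword type has replaced the old "index": "not_analyzed" setting on string fields. A text field, as described above, is analyzed when it is indexed. A keyword field is searched as a whole value, so it is not analyzed at index time; typical examples are postal codes, phone numbers, and ID card numbers. keyword fields are usually used for filtering, sorting, and aggregations.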
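
The three typical uses can be sketched in plain Python. This is only an analogy over an in-memory list, not an Elasticsearch client call; the sample documents are hypothetical, and studymodel is treated here as a keyword-style field whose value is always compared as a whole.

```python
from collections import Counter

# Hypothetical documents; studymodel is treated as one whole, unanalyzed value.
docs = [
    {"name": "Bootstrap", "studymodel": "201002"},
    {"name": "spring", "studymodel": "201001"},
    {"name": "java", "studymodel": "201001"},
]

# Filtering: keep documents whose value equals the filter exactly.
filtered = [d["name"] for d in docs if d["studymodel"] == "201001"]

# Sorting: order documents by the whole value.
ordered = [d["name"] for d in sorted(docs, key=lambda d: d["studymodel"])]

# Aggregation: count documents per distinct value.
buckets = Counter(d["studymodel"] for d in docs)

print(filtered)
print(ordered)
print(dict(buckets))
```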

date type

Date fields do not need an analyzer. They are commonly used for sorting.

format: sets the accepted date format(s).

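Example: the setting below lets the timestamp field accept three formats: year-month-day hour:minute:second, year-month-day, and milliseconds since the epoch.

PUT /book/_mapping
{
  "properties": {
    "timestamp": {
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
    }
  }
}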
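
The idea behind the || separator, trying each listed format in turn until one fits, can be sketched in Python as follows. This is an illustration and not Elasticsearch's own parser: parse_timestamp and PATTERNS are hypothetical names, the strptime patterns are hand-translated equivalents of the two date patterns, and values without a time zone are assumed to be UTC.

```python
from datetime import datetime, timezone

# Hand-translated equivalents of "yyyy-MM-dd HH:mm:ss" and "yyyy-MM-dd".
PATTERNS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d"]


def parse_timestamp(value):
    """Try each format in order; fall back to milliseconds since the epoch."""
    for pattern in PATTERNS:
        try:
            parsed = datetime.strptime(value, pattern)
            return parsed.replace(tzinfo=timezone.utc)  # assumption: UTC
        except ValueError:
            continue
    # Last alternative: an all-digit string is read as epoch milliseconds.
    if value.isdigit():
        return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    raise ValueError(f"no configured format matches {value!r}")


print(parse_timestamp("2018-07-04 18:28:58"))
print(parse_timestamp("2018-07-04"))
print(parse_timestamp("1530728938000"))
```

A value that matches none of the alternatives, such as "04/07/2018", raises an error in this sketch, much as a document whose date does not match any configured format is rejected.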

Insert a document:

POST /book/_doc/3
{
  "name": "spring开发基础",
  "description": "spring 在java领域非常流行，java程序员都在用。",
  "studymodel": "201001",
  "pic": "group1/M00/00/01/wKhlQFqO4MmAOP53AAAcwDwm6SU490.jpg",
  "timestamp": "2018-07-04 18:28:58"
}

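Modifying a mapping

You can define a mapping manually when creating an index, or add a mapping for a new field later, but you cannot update the mapping of an existing field.

Reason: existing data has already been analyzed and stored according to the current mapping, so changing the mapping would no longer match the data that is already indexed.

Adding a new field to the mapping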

PUT /book/_mapping
{
  "properties": {
    "new_field": {
      "type": "text",
      "index": false
    }
  }
}

Changing the mapping of an existing field

PUT /book/_mapping/
{
  "properties" : {
    "studymodel" : {
      "type" :    "keyword"
    }
  }
}

Result:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "mapper [studymodel] of different type, current_type [text], merged_type [keyword]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "mapper [studymodel] of different type, current_type [text], merged_type [keyword]"
  },
  "status": 400
}
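
The rule shown by the two requests above (new fields may be added, the type of an existing field may not change) can be modelled with a few lines of Python. This is a simplified sketch, not Elasticsearch's actual merge logic; merge_mapping is a hypothetical helper, and its error text merely imitates the message in the response above.

```python
def merge_mapping(current, update):
    """Return a new mapping with the update applied, or raise on a type change."""
    merged = dict(current)
    for field, spec in update.items():
        existing = merged.get(field)
        if existing is not None and existing["type"] != spec["type"]:
            raise ValueError(
                f"mapper [{field}] of different type, "
                f"current_type [{existing['type']}], merged_type [{spec['type']}]"
            )
        merged[field] = spec
    return merged


book = {"name": {"type": "text"}, "studymodel": {"type": "text"}}

# Adding a field that does not exist yet is accepted.
book = merge_mapping(book, {"new_field": {"type": "text", "index": False}})
print(sorted(book))

# Changing the type of an existing field is rejected.
try:
    merge_mapping(book, {"studymodel": {"type": "keyword"}})
except ValueError as err:
    print(err)
```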