Java的web开发需要excel的导入导出工具,所以需要一定的工具类实现,如果是使用easypoi、Hutool导入导出excel,会非常的损耗内存,因此可以尝试使用easyexcel解决大数据量的数据的导入导出,且可以通过Java8的函数式编程解决该问题。
使用easyexcel,虽然不太会出现OOM的问题,但是如果是大数据量的情况下也会有一定量的内存溢出的风险,所以我打算从以下几个方面优化这个问题:
使用Java8的函数式编程实现低代码量的数据导入
使用反射等特性实现单个接口导入任意excel
使用线程池实现大数据量的excel导入
maven导入
1 2 3 4 5 6 <dependency > <groupId > com.alibaba</groupId > <artifactId > easyexcel</artifactId > <version > 3.0.5</version > </dependency >
使用泛型实现对象的单个Sheet导入
先实现一个类,用来指代导入的特定的对象
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 @Data @NoArgsConstructor @AllArgsConstructor @TableName("stu_info") @ApiModel("学生信息") public class StuInfo { private static final long serialVersionUID = 1L ; @ColumnWidth(10) @ApiModelProperty("姓名") @ExcelProperty(value = "姓名",order = 0) @ExportHeader(value = "姓名",index = 1) private String name; @ExcelProperty(value = "年龄",order = 1) @ApiModelProperty("年龄") @ExportHeader(value = "年龄",index = 2) private Integer age; @ExcelProperty(value = "身高",order = 2) @ApiModelProperty("身高") @ExportHeader(value = "身高",index = 4) private Double tall; @ExcelProperty(value = "自我介绍",order = 3) @ApiModelProperty("自我介绍") @ExportHeader(value = "自我介绍",index = 3,ignore = true) private String selfIntroduce; @ExcelProperty(value = "图片信息",order = 4) @ApiModelProperty("图片信息") @ExportHeader(value = "图片信息",ignore = true) private Blob picture; @ExcelProperty(value = "性别",order = 5) @ApiModelProperty("性别") private Integer gender; @DateTimeFormat("yyyy-MM-dd HH:mm:ss:") @ExcelProperty(value = "入学时间",order = 6) @ApiModelProperty("入学时间") private String intake; @ExcelProperty(value = "出生日期",order = 7) @ApiModelProperty("出生日期") private String birthday; }
重写ReadListener接口
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 @Slf4j public class UploadDataListener <T> implements ReadListener <T> { private static final int BATCH_COUNT = 100 ; private List<T> cachedDataList = ListUtils.newArrayListWithExpectedSize(BATCH_COUNT); private Predicate<T> predicate; private Consumer<Collection<T>> consumer; public UploadDataListener (Predicate<T> predicate, Consumer<Collection<T>> consumer) { this .predicate = predicate; this .consumer = consumer; } public UploadDataListener (Consumer<Collection<T>> consumer) { this .consumer = consumer; } @Override public void invoke (T data, AnalysisContext context) { if (predicate != null && !predicate.test(data)) { return ; } cachedDataList.add(data); if (cachedDataList.size() >= BATCH_COUNT) { try { consumer.accept(cachedDataList); } catch (Exception e) { log.error("Failed to upload data!data={}" , cachedDataList); throw new BizException ("导入失败" ); } cachedDataList = ListUtils.newArrayListWithExpectedSize(BATCH_COUNT); } } @Override public void doAfterAllAnalysed (AnalysisContext context) { if (CollUtil.isNotEmpty(cachedDataList)) { try { consumer.accept(cachedDataList); log.info("所有数据解析完成!" ); } catch (Exception e) { log.error("Failed to upload data!data={}" , cachedDataList); if (e instanceof BizException) { throw e; } throw new BizException ("导入失败" ); } } } }
Controller层的实现
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 @ApiOperation("只需要一个readListener,解决全部的问题") @PostMapping("/update") @ResponseBody public R<String> aListener4AllExcel (MultipartFile file) throws IOException { try { EasyExcel.read(file.getInputStream(), StuInfo.class, new UploadDataListener <StuInfo>( list -> { service.saveBatch(list); log.info("从Excel导入数据一共 {} 行 " , list.size()); })) .sheet() .doRead(); } catch (IOException e) { log.error("导入失败" , e); throw new BizException ("导入失败" ); } return R.success("SUCCESS" ); }
但是这种方式只能实现已存对象的功能实现,如果要新增一种数据的导入,那我们需要怎么做呢?
可以通过读取成Map,根据顺序导入到数据库中。
通过实现单个Sheet中任意一种数据的导入
Controller层的实现
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 @ApiOperation("只需要一个readListener,解决全部的问题") @PostMapping("/listenMapDara") @ResponseBody public R<String> listenMapDara (@ApiParam(value = "表编码", required = true) @NotBlank(message = "表编码不能为空") @RequestParam("tableCode") String tableCode, @ApiParam(value = "上传的文件", required = true) @NotNull(message = "上传文件不能为空") MultipartFile file) throws IOException { try { EasyExcel.read(file.getInputStream(), new NonClazzOrientedListener ( list -> { log.info("从Excel导入数据一共 {} 行 " , list.size()); })) .sheet() .doRead(); } catch (IOException e) { log.error("导入失败" , e); throw new BizException ("导入失败" ); } return R.success("SUCCESS" ); }
重写ReadListener接口
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 @Slf4j public class NonClazzOrientedListener implements ReadListener <Map<Integer, String>> { private static final int BATCH_COUNT = 100 ; private List<List<Object>> rowsList = ListUtils.newArrayListWithExpectedSize(BATCH_COUNT); private List<Object> rowList = new ArrayList <>(); private Predicate<Map<Integer, String>> predicate; private Consumer<List> consumer; public NonClazzOrientedListener (Predicate<Map<Integer, String>> predicate, Consumer<List> consumer) { this .predicate = predicate; this .consumer = consumer; } public NonClazzOrientedListener (Consumer<List> consumer) { this .consumer = consumer; } private boolean flag = false ; @Override public void invoke (Map<Integer, String> row, AnalysisContext analysisContext) { consumer.accept(rowsList); rowList.clear(); row.forEach((k, v) -> { log.debug("key is {},value is {}" , k, v); rowList.add(v == null ? "" : v); }); rowsList.add(rowList); if (rowsList.size() > BATCH_COUNT) { log.debug("执行存储程序" ); log.info("rowsList is {}" , rowsList); rowsList.clear(); } } @Override public void doAfterAllAnalysed (AnalysisContext analysisContext) { consumer.accept(rowsList); if (CollUtil.isNotEmpty(rowsList)) { try { log.debug("执行最后的程序" ); log.info("rowsList is {}" , rowsList); } catch (Exception e) { log.error("Failed to upload data!data={}" , rowsList); if (e instanceof BizException) { throw e; } throw new BizException ("导入失败" ); } finally { rowsList.clear(); } } }
这种方式可以通过把表中的字段顺序存储起来,通过配置数据和字段的位置实现数据的新增,那么如果出现了导出数据模板/手写excel的时候顺序和导入的时候顺序不一样怎么办?
可以通过读取header进行实现,通过表头读取到的字段,和数据库中表的字段进行比对,只取其中存在的数据进行排序添加
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 @Override public void invokeHead (Map<Integer, ReadCellData<?>> headMap, AnalysisContext context) { Map<Integer, String> columMap = ConverterUtils.convertToStringMap(headMap, context); Map<String, String> columnList = new HashMap (); columMap.forEach((key, value) -> { if (columnList.containsKey(value)) { filterList.add(key); } }); log.info("解析到一条头数据:{}" , JSON.toJSONString(columMap)); }
那么这些问题都解决了,如果出现大数据量的情况,如果要极大的使用到cpu,该怎么做呢?
可以尝试使用线程池进行实现
使用线程池进行多线程导入大量数据
Java中线程池的开发与使用与原理我可以单独写一篇文章进行讲解,但是在这边为了进行好的开发我先给出一套固定一点的方法。
由于ReadListener不能被注册到IOC容器里面,所以需要在外面开启
详情可见Spring Boot通过EasyExcel异步多线程实现大数据量Excel导入,百万数据30秒
通过泛型实现对象类型的导出
1 2 3 4 5 6 7 8 9 10 11 12 public <T> void commonExport (String fileName, List<T> data, Class<T> clazz, HttpServletResponse response) throws IOException { if (CollectionUtil.isEmpty(data)) { data = new ArrayList <>(); } fileName = URLEncoder.encode(fileName, "UTF-8" ); response.setContentType("application/vnd.ms-excel" ); response.setCharacterEncoding("utf-8" ); response.setHeader("Content-disposition" , "attachment;filename=" + fileName + ".xlsx" ); EasyExcel.write(response.getOutputStream()).head(clazz).sheet("sheet1" ).doWrite(data); }
直接使用该方法可以作为公共的数据的导出接口
如果想要动态的下载任意一组数据怎么办呢?可以使用这个方法
1 2 3 4 5 6 7 8 9 10 11 public void exportFreely (String fileName, List<List<Object>> data, List<List<String>> head, HttpServletResponse response) throws IOException { if (CollectionUtil.isEmpty(data)) { data = new ArrayList <>(); } fileName = URLEncoder.encode(fileName, "UTF-8" ); response.setContentType("application/vnd.ms-excel" ); response.setCharacterEncoding("utf-8" ); response.setHeader("Content-disposition" , "attachment;filename=" + fileName + ".xlsx" ); EasyExcel.write(response.getOutputStream()).head(head).sheet("sheet1" ).doWrite(data); }
什么?不仅想一个接口展示全部的数据与信息,还要增加筛选条件?这个后期我可以单独写一篇文章解决这个问题。
今天的分享就到这里了。